Purchase Preference Analysis and Trend Forecast
-- A Fuzzy Utility Function Approach to Data Mining
I. Brief Background on Consumption Theory
Consumer Behaviors and Preference
Consumers generally differ from one another in their consumption behaviors and
preferences. One may spend money on computers and technical books, while another
may spend on clothing and food. Information on consumer preference is of great
value to a marketing company, a bank, or a credit card company, which can use it
to target different groups of consumers for an improved response rate or profit.
By the same token, information on the consumption preferences of the residents
of a specific region can help businesses plan their operations in that region
for improved profit. It is therefore important to have a tool that can help
analyze consumer behaviors and forecast changes in purchase patterns and
purchase trends.
Fuzzy Consumption Utility Function-based Utility Theory
In studying advanced methodology for consumption behaviors, AI researchers at Zaptron
Systems have developed so-called fuzzy utility functions that can model and
describe the consumption behaviors of a target consumer group.
Consumption Utility - A criterion (or index) used to evaluate the effectiveness
of a customer's consumption. A low value of consumption utility, say 0.15,
indicates that a customer is not satisfied with the consumption of a certain
commodity, while a high value, say 0.96, indicates that the customer is very
satisfied. There are formal theories on utility, including ordinal utility,
cardinal utility and marginal utility.
Consumption utility function - The behavioral characteristics of human beings
can be represented by the concept of consumption utility, and the consumption
utility function is the mathematical description of this concept. Human
consumption behaviors are determined by the following two types of factors:
objective factors - the physical, chemical, biological and artistic properties of goods;
subjective factors - the consumer's interest, preference and psychological state.
II. Fuzzy Utility Function for Consumption
Fuzzy Set Theoretical Approach -- Consumption utility is in fact a fuzzy
concept. To model the subjective factors above, fuzzy set theory is used to
describe different levels of consumer satisfaction with respect to various
consumption plans (spending patterns), such as "not satisfied," "somewhat
satisfied," "very satisfied," and so on. Mathematically, the fuzzy utility
function is a more accurate measure of consumption utility. It can describe the
relationships among spending, price, consumption composition (decomposition),
preference and the subjective measure of commodity or service values.
Mathematical model of fuzzy consumption functions - To model the consumption of
one commodity, a fuzzy function Ui must satisfy the following three conditions
(Xi is the spending on the i-th commodity):
(1) Ui: Xi --> Ui (Ui maps the spending Xi to a utility value)
(2) Ui --> 1 as Xi --> infinity, and Ui --> 0 as Xi --> 0
(3) The first derivative of Ui w.r.t. Xi must be positive, and the second derivative negative.
One commonly used Ui has the following
Fi(.) form (subscript i is for the i-th commodity or service):
Ui = Fi(Di, Pi, Si, C)
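The article does not spell out the form of Fi. As an illustration only, the sketch below uses a saturating exponential, 1 - exp(-x/scale), which is one simple function that meets the three conditions above. The function names and the `scale` parameter are assumptions made for this example; they are not the Zaptron model.

```python
import math


def saturating_utility(x, scale):
    """Hypothetical single-commodity utility: 1 - exp(-x / scale).

    x is the spending on the commodity (x >= 0); scale > 0 is an
    illustrative parameter, not one of the article's Di, Pi, Si or C.
    """
    if x < 0 or scale <= 0:
        raise ValueError("x must be >= 0 and scale must be > 0")
    return 1.0 - math.exp(-x / scale)


def first_derivative(x, scale):
    """dU/dx, positive for every x >= 0."""
    return math.exp(-x / scale) / scale


def second_derivative(x, scale):
    """d2U/dx2, negative for every x >= 0 (diminishing returns)."""
    return -math.exp(-x / scale) / scale ** 2


if __name__ == "__main__":
    for spending in (0.0, 50.0, 200.0, 1000.0):
        print(spending, round(saturating_utility(spending, 200.0), 4))
```

Any other increasing, concave function that starts at 0 and saturates at 1 would serve equally well for this check.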
For N commodities or services, the total fuzzy utility function U is expressed
as a weighted sum of the utility functions Ui of the individual categories, i.e.,
U = W1*U1 + W2*U2 + W3*U3 + ..... + WN*UN
= W1*F1 + W2*F2 + W3*F3 + ..... + WN*FN
where
U - Total utility function for all possible consumption categories. It is an
indicator of the consumer's level of satisfaction and lies between 0 and 1,
with U = 1 for maximum satisfaction and U = 0 for no satisfaction.
Ui or Fi - The utility function for the i-th consumption category. It lies
between 0 and 1, with Ui = 1 for maximum satisfaction and Ui = 0 for no
satisfaction.
Di - a parameter related to the customer's subjective evaluation of consumption
(depreciation)
Qi = Di*Pi - appreciation (not "satisfaction"), the consumer's subjective
measure of the value of a consumption
Pi - price of the i-th commodity or
service
Si - percentage of spending on i-th
commodity, 0 < Si < 1 and Si = Ai + Bi/C
Ai - Limit spending percentage of the i-th commodity in total spending, i.e.
the share one would spend on it if average personal income rose so much that
one could spend "to the limit" (as much as he or she desires) with unlimited
financial resources. We always have 0.0 < Ai < 1.0.
Bi - The trend of change in Ai due to a change in the average personal income
level. Note the relationship between Ai and Bi:
Bi > 0 if Ai decreases, and
Bi < 0 if Ai increases.
C - total spending amount, in dollars, on
all N commodities or services
Wi - a weighting factor representing a
consumer's preference to the i-th commodity (0<Wi<1)
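The weighted sum U = W1*U1 + ... + WN*UN can be sketched in a few lines. The weights and per-category utilities below are made-up illustrative numbers, not values from the article, and the function name is an assumption of this example.

```python
def total_utility(weights, utilities):
    """Weighted sum U = sum(Wi * Ui) over all consumption categories."""
    if len(weights) != len(utilities):
        raise ValueError("weights and utilities must have the same length")
    for u in utilities:
        if not 0.0 <= u <= 1.0:
            raise ValueError("each Ui must lie between 0 and 1")
    return sum(w * u for w, u in zip(weights, utilities))


# Hypothetical example with three categories (weights sum to 1).
example_weights = [0.5, 0.3, 0.2]
example_utilities = [0.2, 0.6, 0.9]

if __name__ == "__main__":
    print(round(total_utility(example_weights, example_utilities), 4))
```

When the weights are non-negative and sum to 1, U is a weighted average of the Ui and therefore also lies between 0 and 1.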
Computation of Parameters {Di, Qi, Si,
Ai, Bi, Wi} - Based on the Maximum Utility Principle, they can be computed by solving a
set of complex mathematical equations. Zaptron's DataX software suite has special
modules that compute them.
III. A Case Study using DataX
Problem Background - Annual data on the average spending amount on 6 categories
by customers in a remote rural area are available (see Table-1 below). The area
has a severe electric power shortage, which is reflected in the small spending
amounts in the "Energy" column of Table-1. Table-2 shows the parameters
computed using the Zaptron model.
Table-1 Annual average consumption data for 6 categories (unit: 100 US$):
Year | Food | Clothing | Energy | Housing | Supplies | Entertainment
1980 | 171.9 | 37.2 | 26.2 | 19.6 | 29.8 | 5
1981 | 175 | 39.3 | 26.8 | 100 | 35 | 5.2
1982 | 155.7 | 36.3 | 24.1 | 49.7 | 37.9 | 4.2
1983 | 188.1 | 43.4 | 28.4 | 43.2 | 44.9 | 6.6
1984 | 220.3 | 40.2 | 27.5 | 70.8 | 56.9 | 7.2
Table-2 Parameters computed using fuzzy consumption function:
Consumption Category | Limit Spending Ai (%) | Spending Trend Bi | Preference Wi | Appreciation Qi
Food | 36.8 | 52.081 | 0.149 | 1086.84
Clothing | 2.90 | 29.169 | 0.385 | 84.94
Energy | 1.80 | 20.329 | 0.383 | 52.68
Housing | 41.4 | -88.76 | 0.041 | 1221.63
Supplies | 15.4 | -13.09 | 0.026 | 453.45
Entertainment | 1.70 | -0.454 | 0.017 | 51.20
(1) Limit Spending Percentage (parameter Ai, in %) is largest for housing
(41.4%) and among the smallest for energy (1.80%, just above entertainment at
1.70%). These Ai values are plausible: with unlimited income, people in general
would keep spending on a better house or on home improvement, but would not
spend extra money on energy.
(2) Consumption Preference (parameter Wi) for consumers in this rural area
(with severe power shortage) ranks, from high to low: clothing (Wi = 0.385),
energy (Wi = 0.383), food (Wi = 0.149), housing (Wi = 0.041), supplies
(Wi = 0.026) and entertainment (Wi = 0.017). These numbers show that people
most prefer buying new clothing. The Wi value for energy is almost as high,
which means that people in this area have to pay close attention (strong
preference) to energy consumption to meet basic human needs (for instance, they
need to use homemade firewood to cook food daily), given the severe power
shortage in the area. Naturally, the preference for food and housing is higher
than that for supplies and entertainment, and the preference for entertainment
is the lowest.
(3) Spending Trend (parameter Bi) - The Bi values show the positive or negative
changes (spending trend) in Limit Spending Percentage (parameter Ai). Positive
Bi values, such as those for food, clothing and energy in Table-2, indicate a
decreasing trend in limit spending percentage on those categories. Negative
values, such as those for housing, supplies and entertainment in Table-2,
indicate an increasing trend in limit spending percentage on those items. In
other words, with unlimited personal incomes, people would spend more (to the
limit) on housing, supplies and entertainment, and less on food, clothing and
energy. This matches the experience of most people in most areas of the world.
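The role of the sign of Bi can be checked numerically from Si = Ai + Bi/C: when Bi is positive the share Si falls toward Ai as total spending C grows, and when Bi is negative it rises toward Ai. The sketch below uses the Ai and Bi values of Table-2 and assumes that C is expressed in the same units as Table-1; that unit assumption and the two spending levels (300 and 3000) are choices made for this example, not statements from the article.

```python
# (Ai in %, Bi) per category, copied from Table-2.
TABLE2_AI_BI = {
    "Food": (36.8, 52.081),
    "Clothing": (2.90, 29.169),
    "Energy": (1.80, 20.329),
    "Housing": (41.4, -88.76),
    "Supplies": (15.4, -13.09),
    "Entertainment": (1.70, -0.454),
}


def spending_share(ai_percent, bi, total_spending):
    """Si = Ai + Bi / C, returned as a fraction of total spending."""
    if total_spending <= 0:
        raise ValueError("total_spending must be positive")
    return ai_percent / 100.0 + bi / total_spending


if __name__ == "__main__":
    for name, (ai, bi) in TABLE2_AI_BI.items():
        low = spending_share(ai, bi, 300.0)    # illustrative low C
        high = spending_share(ai, bi, 3000.0)  # illustrative high C
        direction = "falls" if high < low else "rises"
        print(f"{name}: {low:.3f} -> {high:.3f} ({direction} toward {ai}%)")
```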
(4) Appreciation (parameter Qi) - These values show how people value the six
types of consumption; larger numbers indicate higher value. Table-2 shows that
the Appreciation (Qi) values are in the same order as the Limit Spending
Percentage (Ai) values, which is the expected result.
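The ordering claim in observation (4) is easy to verify by ranking the categories on each parameter. The values below are copied from Table-2; the variable names are only for this sketch.

```python
# Limit spending percentage Ai (%) and appreciation Qi, from Table-2.
AI = {"Food": 36.8, "Clothing": 2.90, "Energy": 1.80,
      "Housing": 41.4, "Supplies": 15.4, "Entertainment": 1.70}
QI = {"Food": 1086.84, "Clothing": 84.94, "Energy": 52.68,
      "Housing": 1221.63, "Supplies": 453.45, "Entertainment": 51.20}

# Categories sorted from the largest to the smallest value.
rank_by_ai = sorted(AI, key=AI.get, reverse=True)
rank_by_qi = sorted(QI, key=QI.get, reverse=True)

if __name__ == "__main__":
    print("By Ai:", rank_by_ai)
    print("By Qi:", rank_by_qi)
    print("Same order:", rank_by_ai == rank_by_qi)
```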
IV. Consumption Trend Forecast
The method can be used in combination with the time series analysis method to
predict the purchase trend and the customer's satisfaction level. Two examples
are given here to show the forecast results for (1) purchase trend and
(2) consumer's satisfaction level.
4.1 Forecast on Future Consumption Trend
We used the data for the years 1980 to 1984 given in Table-1 above to forecast
the consumption trend using the Time Series Analysis Tool of DataX. The
forecast assumes that annual spending increases by 8% per year and population
by 0.9% per year. The forecast results obtained for the years 1985, 1990, 1995
and 2000 are listed in Table-3. The following observations can be made from
Table-3; they are consistent with Engel's law in economics.
1) The Limit Spending Percentage (Ai) rises most for housing, from 21.9% in 1985 to 35.3% in 2000.
2) The Limit Spending Percentage (Ai) falls most for food, from 48.4% in 1985 to 40.5% in 2000.
3) The percentage for clothing decreases from 9.2% in 1985 to 4.9% in 2000.
4) The percentages for supplies and entertainment increase slowly.
Table-3 Forecast on Limit Spending Percentage (Ai parameter in %)
Category | 1985 | 1990 | 1995 | 2000
Food | 48.4 | 44.7 | 42.1 | 40.5
Clothing | 9.2 | 7.2 | 5.8 | 4.9
Energy | 6.2 | 4.8 | 3.8 | 3.2
Housing | 21.9 | 28.2 | 32.4 | 35.3
Supplies | 12.5 | 13.4 | 14.0 | 14.5
Entertainment | 1.64 | 1.67 | 1.69 | 1.70
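The article does not describe how DataX produces this forecast. As a rough cross-check only, the sketch below applies Si = Ai + Bi/C with the Table-2 parameters and lets total spending C grow by 8% per year from the 1984 total in Table-1. Ignoring the 0.9% population growth, starting from the 1984 total, and the function name are all assumptions of this example. Under these assumptions the food and housing shares come out within about half a percentage point of the tabulated values.

```python
# (Ai in %, Bi) per category, copied from Table-2.
TABLE2_AI_BI = {
    "Food": (36.8, 52.081),
    "Clothing": (2.90, 29.169),
    "Energy": (1.80, 20.329),
    "Housing": (41.4, -88.76),
    "Supplies": (15.4, -13.09),
    "Entertainment": (1.70, -0.454),
}

# 1984 spending per category from Table-1 (same order as above).
SPENDING_1984 = [220.3, 40.2, 27.5, 70.8, 56.9, 7.2]
BASE_YEAR = 1984
ANNUAL_GROWTH = 0.08  # assumed growth rate of total spending C


def forecast_share_percent(category, year):
    """Spending share in % for a category, using Si = Ai + Bi / C."""
    total = sum(SPENDING_1984) * (1.0 + ANNUAL_GROWTH) ** (year - BASE_YEAR)
    ai_percent, bi = TABLE2_AI_BI[category]
    return ai_percent + 100.0 * bi / total


if __name__ == "__main__":
    for category in TABLE2_AI_BI:
        row = [round(forecast_share_percent(category, y), 1)
               for y in (1985, 1990, 1995, 2000)]
        print(category, row)
```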
4.2 Forecast on Consumer's Satisfaction Level
The utility functions discussed above represent the consumer's satisfaction
level for individual and combined consumption. They can be used to forecast
customer satisfaction levels for the various consumption categories. Table-4
gives the results computed using DataX. The table shows that the satisfaction
level for housing is the lowest (U4 = 0.0788 in 1985), so the demand for
housing (living) will be high, more than doubling every 5 years from 1985 to
2000.
Table-4 Forecast on Consumption Utility (Consumer Satisfaction Level Ui)
Utility Function | 1985 | 1990 | 1995 | 2000
Total Utotal | 0.1477 | 0.2074 | 0.2877 | 0.3911
Food U1 | 0.1840 | 0.2412 | 0.3180 | 0.4170
Clothing U2 | 0.3924 | 0.4350 | 0.4922 | 0.5659
Energy U3 | 0.4177 | 0.4585 | 0.5133 | 0.5840
Housing U4 | 0.0788 | 0.1440 | 0.2301 | 0.3419
Supplies U5 | 0.1183 | 0.1800 | 0.2631 | 0.3701
Entertainment U6 | 0.1358 | 0.1963 | 0.2777 | 0.3825
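A table like this can be scanned programmatically to find the category with the lowest forecast satisfaction in each year and to see how fast each utility grows between forecast years. The sketch below uses the per-category rows of Table-4; the helper names are assumptions of this example.

```python
YEARS = [1985, 1990, 1995, 2000]

# Per-category utilities Ui from Table-4 (the Utotal row is left out).
TABLE4 = {
    "Food": [0.1840, 0.2412, 0.3180, 0.4170],
    "Clothing": [0.3924, 0.4350, 0.4922, 0.5659],
    "Energy": [0.4177, 0.4585, 0.5133, 0.5840],
    "Housing": [0.0788, 0.1440, 0.2301, 0.3419],
    "Supplies": [0.1183, 0.1800, 0.2631, 0.3701],
    "Entertainment": [0.1358, 0.1963, 0.2777, 0.3825],
}


def lowest_category(year):
    """Category with the smallest forecast utility in the given year."""
    index = YEARS.index(year)
    return min(TABLE4, key=lambda name: TABLE4[name][index])


def growth_ratios(category):
    """Ratio of each forecast utility to the one five years earlier."""
    values = TABLE4[category]
    return [later / earlier for earlier, later in zip(values, values[1:])]


if __name__ == "__main__":
    for year in YEARS:
        print(year, "lowest satisfaction:", lowest_category(year))
    print("Housing growth ratios:",
          [round(r, 2) for r in growth_ratios("Housing")])
```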
V. Conclusions
This article introduces a new data mining method for purchase preference
analysis and trend forecast. A fuzzy set theoretical definition of the
consumption utility function is introduced, and the computation of the model
parameters is discussed. Examples are given to show the efficacy of this
innovative approach to data mining in business and finance.