Journal of theoretical and applied electronic commerce research
version ISSN 0718-1876
J. theor. appl. electron. commer. res. vol.7 no.1 Talca Apr. 2012
ISSN 0718-1876 Electronic Version VOL 7 / ISSUE 1 / APRIL 2012 / 64-76 © 2012 Universidad de Talca - Chile
Comparison Shopping Agents and Online Price Dispersion: A Search Cost based Explanation
Bhavik K. Pathak
Indiana University South Bend, School of Business and Economics, South Bend, USA, firstname.lastname@example.org
Search costs and consumer heterogeneity are two important explanations for price dispersion in brick and mortar (B&M) markets. Comparison shopping agents (CSAs) provide single-click decision support for consumers' purchasing decisions and reduce their search costs by providing detailed price dispersion information. Contemporary researchers in IS observe that even with such negligible search costs, price dispersion persists in online markets. Consumer heterogeneity and retailer heterogeneity have been agreed upon as the two primary explanations for online price dispersion. In this paper, popular CSAs are analyzed to check whether they provide complete and accurate price dispersion information. It is shown that because of selection bias and temporal delay in updating information, contemporary CSAs may not present complete and accurate price dispersion information. In order to reach an optimal purchasing decision, consumers may have to rely on a sequential search across multiple CSAs or browse through various retailers. This research adds a search cost dimension to explain the continuance of price dispersion in online markets.
Keywords: Price dispersion, Search costs, CSA, Shopbots, Comparison shopping agents
1 Introduction
The advent of the Internet has provided sellers a cost-effective platform to extend their reach beyond geographical or temporal barriers. Moreover, because of centralized inventory, online retailers have expanded their product portfolios and embraced the long tail phenomenon. The shopper's purchasing decision has become complex because of the ever-increasing number of options in terms of sellers and product varieties. Alternate business models of information intermediaries have emerged that provide detailed information to shoppers with minimal effort. Comparison shopping agents are such infomediaries: they supply detailed seller and product information to shoppers. Researchers in information systems (IS), marketing, and economics have been intrigued by the role these CSAs play in reducing price dispersion in online markets.
The persistence of price dispersion in online markets, even in the presence of CSAs, is an interesting problem, and researchers in marketing, economics, and information systems have studied it. Prior literature discusses consumer heterogeneity and search cost based justifications for the existence of price dispersion in brick and mortar markets. CSAs provide detailed price and retailer comparisons based on the product information supplied by shoppers. This reduces the search costs associated with obtaining price quotes from various retailers and hence should remove the factors that tend to facilitate price dispersion in the market. However, researchers have found that, just like their B&M counterparts, online retail markets also display spatial and temporal price dispersion. Much of the prior research acknowledges the role of CSAs in reducing search costs and provides non-search cost based explanations for the existence of price dispersion in online markets. Previous studies have offered consumer heterogeneity, seller heterogeneity, and price discrimination as potential factors that contribute to online price dispersion. The underlying presumption in these studies is that, because of CSAs, the impact of search costs is negligible in online shopping.
This research studies the online price dispersion problem and analyzes the role of search costs in explaining it. The one-click price comparison provided by CSAs can potentially remove the search costs associated with comparison shopping. However, in this paper, I study whether the comparison shopping information provided by CSAs is accurate and complete. If such information is incomplete or inaccurate, then shoppers may have to incur additional search costs. Thus, the objective of this research is to study and explain search cost based justifications for online price dispersion. There are two major reasons that CSAs may provide incomplete and inaccurate price dispersion information: temporal delay and selection bias.
In the past, CSAs mainly relied on real-time Web scraping to obtain pricing data from online retailers. This data collection methodology has some major problems, including query response time, data quality and merchant obfuscation, and merchant blocking. Advances in information technology have created alternate means of sharing or collecting data (e.g. XML, FTP), and today's CSAs mainly rely on obtaining pricing data directly from the retailers. However, due to lower menu costs, online sellers tend to make more SKU-level price changes, and sellers may provide price feed data to CSAs at discrete time periods rather than synchronize their price changes with the CSAs in real time. Thus, at any given point in time, CSAs may not present accurate price dispersion because of this temporal lag in updating price information.
Moreover, the majority of CSAs provide free comparison shopping services to shoppers and rely primarily on listing and click-through fees from online retailers for their revenue. They charge online retailers various fees, such as referral fees, merchant storefront fees, search result ordering fees, advertisement charges, revenue sharing programs, and listing fees. Every CSA has its own pricing structure. For example, shopping.com's cost-per-click (CPC) referral fees range from $0.05 to $0.95 depending on the product category. For some other CSAs, such as bizrate.com, retailers have to decide how much they want to pay in CPC charges. The more they pay, the higher their listing appears in the search results. Typically, the selection and ordering of online retailers in a CSA's search results is based on competitive bids. Because of such selection bias, the spatial price dispersion for the same product may differ from one CSA to another. Such economic incentive-based selection bias may restrict merchant participation in CSA programs. Hence, under the merchant-supported revenue model, CSAs may not present complete price dispersion to shoppers, which may lead to suboptimal purchasing decisions.
In the absence of complete and accurate price dispersion information, shoppers may pursue their search across multiple CSAs or even various online merchants. In this paper, the two phenomena of temporal delay and selection bias are analyzed to provide search cost based explanations for online price dispersion. The results show a significant delay in price updates at online CSAs across different retailers and products. They also show that participating sellers differ significantly from one CSA to another and hence that selection bias is evident. Due to selection bias and temporal delay in updates, CSAs may present incomplete and inaccurate price dispersion information and hence may not optimize shoppers' purchasing decisions without a comprehensive search across multiple CSAs or online retailers. Prior researchers have provided consumer heterogeneity, retailer heterogeneity, obfuscation, and price discrimination based justifications for online price dispersion. One of the major contributions of this paper is to add a search cost based factor in explaining online price dispersion.
The rest of this paper is organized as follows. Section 2 presents the theory and hypotheses. Section 3 describes the data collection approach. Results and analysis are provided in Section 4, and Section 5 concludes the paper with the discussion, research implications, and limitations.
2 Theory and Hypotheses
Price dispersion has been of high concern for both online and B&M retailers. In the B&M case, it has been shown that price dispersion can exist when firms have perfect information about buyers' reservation prices and demand functions. Prior researchers discuss that information acquisition is costly and hence has an impact on market equilibrium. It has been shown that consumers differ in their costs of information acquisition. In such cases, firms, as discriminating monopolists, exercise price discrimination. However, the scope of such price discrimination is significantly restricted in a competitive market with perfect information.
The online market is considered a competitive market with a very high degree of information transparency. The impact of practicing price dispersion on online retailers may be significantly higher given the emergence of CSAs, where consumers can compare prices across multiple retailers with just a few clicks. Prior researchers in various fields have extensively studied CSAs on issues including price dispersion, consumer behavior, information search costs, and recommender systems. The focus of this research is on price dispersion. Pan et al. have published a brief survey on CSAs. They observe that the majority of studies in this area have recorded substantial price dispersion in online markets and that, in general, online price dispersion is not smaller than in traditional markets. Moreover, they conclude that online price dispersion has declined over time but continues to be substantial. They suggest that multichannel retailers generally charge higher prices and are an important source of price dispersion in online markets. Other researchers also show that price dispersion exists in online markets even after controlling for shipping costs.
Although the emergence of CSAs has made it easy for consumers to obtain price quotes from various merchants, price dispersion still exists in online markets. The law of one price does not prevail in online markets. Prior studies have provided heterogeneity based reasons for the continuing price dispersion. Retailer heterogeneity has been described as one of the major explanations for price dispersion in online markets. Retailers can be differentiated based on shopping experience, shipping, return policy, and other related services. Online tools and services such as Amazon.com's recommender systems and Prime shipping (i.e. a subscription based service that provides free two-day delivery) create switching costs for consumers and hence contribute to online price dispersion. The other major explanation for price dispersion is consumer heterogeneity. Consumers may differ in terms of their brand loyalty and their knowledge of prevailing prices in the market. Other researchers show that when consumers are brand sensitive, retailers may adopt an asymmetric mix pricing strategy wherein they charge higher prices on average for the majority of products but lower prices on some products. Smith concludes that, unlike B&M retailers, where consumers make their purchasing decisions based on the proximity of retailers, online retailers are more likely to be affected by brand name recall. Some prior studies have addressed the issue of service heterogeneity. Varian predicts that online retailers will fall into two main categories: great service with higher prices, and average service with lower prices. Service heterogeneity based price dispersion has also been empirically validated in prior studies. Price formats (e.g. EDLP vs. Hi/Lo) have also been identified as one of the potential sources of price dispersion. Search cost based explanations have been discussed as a major source of price dispersion in B&M markets.
While the majority of these studies have focused on non-search cost based explanations for price dispersion, there is very little, if any, work done in explaining the search cost-led online price dispersion.
CSAs provide easy access for comparing product and price information across multiple retailers. Consumers may make an optimal purchasing decision based on this information only if the information provided by these CSAs is complete and accurate. If the price dispersion information provided by CSAs is incomplete or inaccurate, consumers may have to perform an additional search in order to make informed and optimal purchasing decisions. These additional search costs need to be accounted for and can be one of the potential sources of online price dispersion. There are two specific characteristics of CSAs that create the possibility of such additional search costs.
2.1 Temporal Delay in Updating Price Information
Prior research in the economics and marketing fields has focused on both spatial and temporal price dispersion. Hal Varian was among the first to describe and study temporal price dispersion, which is briefly characterized as follows.
"In a market exhibiting temporal price dispersion, we would see each store varying its price over time. At any moment, a cross section of the market would exhibit price dispersion; but because of the intentional fluctuations in price, consumers cannot learn by experience about stores that consistently have low prices, and hence price dispersion may be expected to persist ."
Temporal price dispersion has also been studied in the context of price rigidity. Price rigidity is the proposition that some prices change slowly in response to market dynamics. Levy discusses that price rigidity varies from market to market and that this variation depends on the magnitude of menu costs. Menu costs are price adjustment costs. In traditional retail markets, where menu costs are quite high, retailers have fewer incentives to adjust their prices in accordance with market dynamics. However, menu costs for online retailers are very low, and they may change their prices more frequently in response to market dynamics. Price change decisions also depend on the market structure, and prices are adjusted more frequently in markets with a large number of competitors. In online markets, especially for computer and electronics products, one generally observes a very large number of competitors, and hence it is reasonable to expect frequent price changes. Further, these changes are irregular, random, or unpredictable, and hence neither consumers nor competitors can learn which stores consistently sell at lower prices. Temporal price dispersion is also described as "hit and run pricing", and it has been shown that online retailers employ these kinds of pricing strategies.
One of the major implications of temporal price dispersion, hit and run pricing, lower menu costs, or price rigidity is that price levels are unpredictable at any instant of time. CSAs present price comparison results for a consumer's product search query, and in order to maximize consumer utility, it is extremely important for CSAs to provide the most accurate price information. To provide such accuracy, CSAs should ideally query all online sellers at the time of the consumer's search request. In the past, the majority of CSAs collected price data from the websites of online retailers in real time. For each user query, a CSA would initiate a search across multiple merchants to download and parse web pages and finally present results to the user. Although this approach was popular because of its merchant independence, it has some major limitations, including data quality and merchant blocking problems as well as obfuscation and bait-and-switch strategies by merchants. Today, the majority of CSAs collect data directly from retailers by using Web interfaces, Web services, APIs, or XML. This merchant-dependent approach has two major issues. First, it brings merchant bias into CSA-based search results. Second, and more importantly, data may be collected or updated at discrete intervals, which may lead to a temporal delay in updating prices. This leads to the first hypothesis.
Hypothesis 1: There is a substantial temporal delay between online price adjustments by retailers and updating of such price adjustments on CSAs.
It is important to realize that, due to this temporal delay, CSAs may present inaccurate price information to consumers, which may result in suboptimal purchase decisions. The probability of such suboptimal decisions increases as the temporal delay increases. Once consumers realize that there is a temporal delay in updating the information and that CSAs may present inaccurate price information, they may decide to visit each online store to validate the prices displayed by CSAs. Most contemporary CSA research assumes zero or negligible search costs. Because of the temporal delay in price updates, however, consumers may have to conduct a sequential search across multiple CSAs.
2.2 Selection Bias
Just like traditional retail markets, online markets also present substantial spatial price dispersion. Although there are many different theoretical rationales for price dispersion, rationales based on search costs and consumer heterogeneity are dominant in the academic literature. The search cost based theory suggests that if there is a positive marginal cost of getting price information from each retailer, then equilibrium price dispersion may exist. In the traditional retail market, these search costs can be interpreted as the costs of making a trip to the retailer or calling a retailer. However, this theory assumes that consumers are homogeneous. The theory based on consumer heterogeneity assumes different types of consumers based on their awareness of price information. These consumers have been broadly classified into informed and uninformed consumers. These models assume that informed consumers know the entire distribution of offered prices and that uninformed consumers know nothing about this distribution. Here, the emphasis is on knowledge of the entire price distribution; in the absence of CSAs, it was assumed that this entire price distribution could be obtained from 'listing services' or an 'information clearinghouse'. Most research on online price dispersion uses the model based on consumer heterogeneity and assumes negligible or zero search costs. It is assumed that CSAs provide single-click access to complete price dispersion information.
The previous generation of CSAs collected data from retailers' websites by using Web scraping methods. Now, however, CSAs obtain data directly from retailers. The revenue model of the majority of CSAs is merchant-driven. The majority of CSAs have their own price lists for various product categories and charge substantial fees to list a vendor in a specific product category. This reflects a revenue-driven selection bias and hence limits the coverage of retailers in their data. CSAs have different criteria for selecting merchants (online retailers). Some CSAs list merchant prices only if a merchant provides XML price feeds. Some CSAs restrict the number of merchants whose prices will be listed for each product category. Some CSAs demand high listing fees and also ask for CPC charges from merchants. Not all merchants can satisfy these criteria, and hence not every merchant participates in every CSA. Retailers may make an optimal decision based on criteria such as CPC charges and the reach of CSAs, and participate in only one or a few of the CSAs. For example, staples.com does not participate in pricegrabber.com's merchant program. Also, some stores do not participate at all. For example, radioshack.com does not participate in the merchant programs of any of the six major CSAs that have been analyzed in this research. Such limited coverage with major exclusions may result in substantially different price dispersion data between two different CSAs. This leads to the second hypothesis.
Hypothesis 2: For the same product, price dispersions presented by two different CSAs differ significantly.
3 Data Collection and Methodology
Each of the hypotheses in the prior section requires a different set of data. While the first hypothesis addresses the inaccuracy in the information related to price dispersion because of a temporal delay in updating pricing information, the second hypothesis focuses on the incompleteness of the price dispersion related information due to selection bias among CSAs.
3.1 Temporal Delay
In order to calculate the temporal delay in updating price information on a CSA, two specific time-based references need to be determined:
1. Price adjustment time (PAT): This is a reference time at which the price of a product is changed by a retailer.
2. Price update time (PUT): This is a reference time at which the price of a product is updated by a CSA.
The temporal delay is the difference between PUT and PAT, that is, the time elapsed from a retailer's price adjustment until the CSA shows the new price. It is important to realize that PUT may differ from CSA to CSA for the same merchant-product combination, as each CSA may use a different data extraction technology or have a different data update procedure. Thus, in order to investigate the first hypothesis, it is important to carry out a comprehensive analysis using temporal delay data from multiple CSAs. In this research, the PUTs of six popular CSAs have been measured. The popularity of CSAs was measured by using Alexa.com rankings, a standard source that provides website rankings based on browsing behavior data. PUTs on CSAs always lag PATs on the merchants' websites. Hence, the measurement of PUTs requires a timely identification of PATs on merchants' websites.
Given the negligible menu costs, online price adjustments are frequent, random, and SKU-specific. It is practically challenging to identify PATs in such a scenario. A PAT can be measured only if all product prices on a merchant's website are continuously tracked. However, many online merchants and the majority of click and mortar (C&M) merchants publish weekly sales circulars. In these circulars, they advertise forthcoming price changes for selected products that go on sale on a specific day, typically Sunday. Such a price adjustment generally remains valid for at least a week. Moreover, some merchants release their weekly sales circulars well in advance. For example, Staples.com releases its sales circular on Thursday for the upcoming week starting on Sunday. These weekly circulars provide PATs for a selected set of products. Thus, by early Sunday morning at the latest, one can determine a set of products whose PATs are on Sunday. Typical online C&Ms follow a regular schedule to adjust their prices based on these weekly sales circulars. For example, Staples changes its prices on Staples.com at 6:00 am EST based on its sales circular. Based on a primary exploratory study, it was identified that by 8:00 am Sunday, all major C&M retailers have changed their prices based on the advertised sales circular for the week starting on Sunday. It is important to note here that not all products advertised in a sales circular require price adjustments. Merchants also advertise products whose prices remain unchanged from one week to another. These products are discarded from this study. Thus, by Sunday, the merchants and their lists of products, along with their prices (both for the previous week and the upcoming week), are captured. For all these products, for the sake of simplicity, 8:00 am Sunday is considered the PAT for this study. Appendix A provides a list of products, corresponding retailers, and prices (previous week and upcoming week).
Prices for all of these product-merchant combinations were tracked twice a day (9:00 am and 9:00 pm) on six CSAs for the next four days. This provided the second set of time-based reference points, at which the prices are updated on the various CSAs. As the data are captured only twice a day, they do not reflect the exact PUT, but they provide a realistic estimate of the time at which price changes are updated on CSAs. Measuring the exact PUT for all product-merchant combinations would require continuous monitoring of these CSAs, which may not be realistically possible. Moreover, prices are not tracked after four days: a typical price adjustment may remain valid for only a week, so four days passes the half-way mark, and any delay beyond this is recorded simply as five days and more. As the sample products for this research vary from one week to another, monitoring the prices of all products for a prolonged time period was logistically challenging. For this research, the sales circular data have been collected from three merchants. Merchants were selected based on the availability of their sales circulars well before the PAT of 8:00 am Sunday. This makes it easy to observe and record both current and forthcoming prices of various products. Not all products from these sales circulars were selected. Certain products, such as retailer-branded products, generic products, bundled promotions, and products without specific model number information, were excluded from the sample, as obtaining price quotes for such items through six CSAs was not possible. For each product-retailer-CSA combination, the difference between PUT and PAT was calculated to determine the temporal delay in updating price information.
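The measurement procedure above can be illustrated with a minimal Python sketch. It assumes each CSA observation is stored as a (timestamp, displayed price) pair; the function name, the price tolerance, the fractional-day output, and the sample dates and prices are all hypothetical and are not taken from the study's data.

```python
from datetime import datetime, timedelta

# Assumption: delays not observed within the tracking window are coded as 5 days,
# mirroring the "five days and more" category described in the text.
CAP_DAYS = 5.0


def estimate_delay_days(pat, new_price, snapshots, tracking_days=4):
    """Estimate the temporal delay (in days) for one product-retailer-CSA combination.

    pat        -- price adjustment time on the retailer's site
    new_price  -- the advertised price that takes effect at PAT
    snapshots  -- iterable of (observed_at, price_shown_on_csa) pairs
    Returns the time from PAT to the first snapshot showing the new price,
    or CAP_DAYS if the new price never appears within the tracking window.
    """
    window_end = pat + timedelta(days=tracking_days)
    for observed_at, price in sorted(snapshots):
        if not (pat <= observed_at <= window_end):
            continue  # ignore observations outside the tracking window
        if abs(price - new_price) < 0.005:  # same price to the cent
            return (observed_at - pat) / timedelta(days=1)
    return CAP_DAYS


# Hypothetical example: PAT is Sunday 8:00 am; the CSA is checked at 9:00 am and 9:00 pm.
pat = datetime(2011, 1, 2, 8, 0)
snapshots = [
    (datetime(2011, 1, 2, 9, 0), 99.99),
    (datetime(2011, 1, 2, 21, 0), 99.99),
    (datetime(2011, 1, 3, 9, 0), 99.99),
    (datetime(2011, 1, 3, 21, 0), 79.99),  # first snapshot showing the new price
]
delay = estimate_delay_days(pat, 79.99, snapshots)
never_updated = estimate_delay_days(pat, 79.99, snapshots[:3])
```

Because snapshots are taken only twice a day, the value returned is an upper estimate of the true delay to within roughly half a day, which matches the caveat stated above.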
3.2 Selection Bias
In order to evaluate the hypothesis on the completeness of price dispersion information, one needs to determine a complete list of merchants selling a specific item. This may eventually require searching all potential merchants selling the sample items. Such an exploratory study may not be fruitful, as it may leave many potential merchants out because of the long-tail phenomenon in electronic commerce. An alternate approach to show the incompleteness of price dispersion information on a CSA is to compare the price dispersion information on two different CSAs and evaluate the differences. If the difference is significant, then it suggests selection bias on the part of individual CSAs. In this study, two popular CSAs, Shopping.com and Cnet.com, are selected, and the price dispersion information for a sample of products is compared.
Products were selected from Amazon.com's top 100 electronics category. Out of these 100 products, digital products such as software and computer games were discarded, as were products with bundle promotions. Certain other products, such as battery chargers and generic items, were not selected either: in the absence of a specific product identifier, it was difficult to find price quotes for such products on CSAs. The final list of products consisted of 41 items. Each of these items was searched on the two CSAs, and detailed information about their prices was collected. Aggregated information from this data is presented in Appendix B. Various measures were used to compare the price dispersion information from the different CSAs. In Appendix B, Count represents the number of merchants selling an item. Max, Min, and Avg are the maximum, minimum, and average price of an item, respectively. Stdev is the standard deviation of the price quotes. CV stands for the coefficient of variation, which is the ratio of the standard deviation to the mean of the price quotes. The coefficient of variation is a standard unitless measure of price dispersion and is widely used in this field.
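As a small illustration of the Appendix B measures, the following Python sketch computes Count, Max, Min, Avg, Stdev, and CV for one product's price quotes on one CSA. The function name and the sample quotes are hypothetical, and the use of the sample (n - 1) standard deviation is an assumption, since the text does not say which variant was used.

```python
from statistics import mean, stdev


def dispersion_summary(quotes):
    """Summarize the price quotes listed by one CSA for one product."""
    if len(quotes) < 2:
        raise ValueError("at least two price quotes are needed to measure dispersion")
    avg = mean(quotes)
    sd = stdev(quotes)  # assumption: sample standard deviation (n - 1 denominator)
    return {
        "count": len(quotes),   # number of merchants listing the item
        "max": max(quotes),
        "min": min(quotes),
        "avg": avg,
        "stdev": sd,
        "cv": sd / avg,         # coefficient of variation: unitless dispersion measure
    }


# Hypothetical quotes (in dollars) for one product on one CSA.
summary = dispersion_summary([100.0, 110.0, 120.0])
```

Because CV divides the standard deviation by the mean, it allows dispersion to be compared across products with very different price levels.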
4 Results and Analysis
Temporal delays in updating prices are calculated by subtracting PATs from PUTs. Overall results for temporal delay are provided in Appendix A. In order to analyze the selection bias of CSAs, price dispersion information on two CSAs is evaluated by using standard measures such as CV, price range, maximum and minimum price, average price, and standard deviation. Results for this comparison are provided in Appendix B.
What counts as a substantial temporal delay in hypothesis 1 may vary with consumer preferences, the degree of price changes, or product characteristics. Given the nature of data transfer between merchants and CSAs, it is practically challenging to have real-time data updates on CSAs. However, a delay of more than 24 hours is substantial enough to misrepresent the price information to a potentially large audience. For the purpose of this research, it is assumed that any delay of more than one day is substantial. It is important to note here that, due to lower online menu costs, retailers may make random and frequent price changes at the SKU level, and the combination of the long tail and virtually global market reach may create a substantial market for a given SKU on any given day. Hence, a delay of one day in updating a price is substantial in today's fast-changing marketplace. The null hypothesis suggests that the difference between PAT and PUT is less than a day, and the alternative hypothesis claims that the delay is more than a day. It can be seen from Appendix A that for some products the temporal delay is clearly more than a day. For example, the difference between the PUT on shopping.com and the PAT on Staples.com for a 19" CRT monitor is two days.
One interesting observation from this data is that temporal delay is a multi-dimensional phenomenon. Temporal delays in updates vary across retailers, CSAs, products, and even time periods. For example, as shown in Figure 1, the temporal delays for two different retailers vary not only for the individual retailers but also across different CSAs. In order to test hypothesis 1, a conservative measure, the minimum temporal delay, is compared with the notion of substantial temporal delay (i.e. one day). The minimum temporal delay for a merchant-product pair is the shortest time period required to update the price information on any of the six CSAs. This information is provided in the last column of Appendix A. For example, for the all-in-one printer (product 2 in appendix B), the PUT on pricegrabber.com is three days, against five days for the other CSAs. Hence, according to this conservative measure, the minimum time to update is taken as three days. The results of the t-test are shown in Table 1.
As can be seen from Table 1, hypothesis 1 is strongly supported. Thus, the difference between the PAT on a merchant's website and the PUT on a CSA is substantial. A comparison of these results across different CSAs is provided in Table 2.
As can be seen from Table 2, some CSAs, such as shopping.com, take on average 3.39 days to update the price information, while other CSAs, such as Cnet.com, take on average 2.53 days to update price changes. Regardless of products, retailers, or time periods, these delays are substantial across all CSAs. If merchants start changing prices more frequently, then a CSA becomes more vulnerable to inaccurate information, and in the absence of a sequential search or multiple click-throughs, consumers may make suboptimal purchase decisions.
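The test of hypothesis 1 described above can be sketched in Python as follows. The sketch takes the minimum delay across CSAs for each merchant-product pair and computes a one-sample t-statistic against the one-day threshold. The function names and all delay values are hypothetical (the first entry only mirrors the printer example in the text), and the exact test configuration behind Table 1 is not specified in the text, so this is an illustration rather than a reproduction.

```python
from math import sqrt
from statistics import mean, stdev


def minimum_delay(delays_by_csa):
    """Conservative measure: the shortest delay (in days) observed on any CSA."""
    return min(delays_by_csa.values())


def one_sample_t(values, mu0=1.0):
    """t-statistic for H0: mean delay equals mu0 (one day) against a larger mean."""
    n = len(values)
    return (mean(values) - mu0) / (stdev(values) / sqrt(n))


# Hypothetical delays in days for four merchant-product pairs.
observations = [
    {"pricegrabber.com": 3, "shopping.com": 5, "bizrate.com": 5},
    {"pricegrabber.com": 2, "shopping.com": 4, "bizrate.com": 3},
    {"pricegrabber.com": 5, "shopping.com": 4, "bizrate.com": 5},
    {"pricegrabber.com": 3, "shopping.com": 3, "bizrate.com": 4},
]
min_delays = [minimum_delay(obs) for obs in observations]
t_stat = one_sample_t(min_delays)
```

A large positive t-statistic indicates that even the fastest CSA for each product takes longer than one day on average; a p-value would then be read from a t distribution with n - 1 degrees of freedom.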
Figure 1: Comparison of temporal delays in price update across different CSAs
To compare the price dispersion between two CSAs, standard measures of price dispersion have been used, such as range, standard deviation, mean, number of sellers, and coefficient of variation. As can be seen from Appendix B, although minimum prices are very similar between the two CSAs, other price dispersion parameters such as standard deviation and CV differ. The descriptive statistics in Appendix B help in understanding the difference between the price dispersion information presented by the two CSAs. Table 3 provides a summary of the average (aggregated) price dispersion parameters, and Figure 2 (a) and Figure 2 (b) compare this information between the two CSAs.
Figure 2 (a): Comparison of average price dispersion parameters between two CSAs
Figure 2 (b): Comparison of average price dispersion parameters between two CSAs
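The dispersion measures used in this comparison (number of sellers, minimum price, range, mean, standard deviation, and coefficient of variation) can be computed for one product on one CSA as in the following minimal sketch. The price lists and the function name are hypothetical, and the use of the sample (n-1) standard deviation is an assumption, since the text does not state which variant was used.

```python
import statistics


def dispersion_summary(prices):
    """Return standard price dispersion measures for one product on one CSA."""
    mean = statistics.mean(prices)
    sd = statistics.stdev(prices)  # sample standard deviation (assumption)
    return {
        "n_sellers": len(prices),
        "min": min(prices),
        "range": max(prices) - min(prices),
        "mean": mean,
        "sd": sd,
        "cv": sd / mean,  # coefficient of variation, unit-free
    }


# Hypothetical price quotes for the same product on two CSAs (illustrative only).
csa_a_prices = [180.0, 200.0, 220.0, 240.0]
csa_b_prices = [180.0, 185.0, 190.0]

summary_a = dispersion_summary(csa_a_prices)
summary_b = dispersion_summary(csa_b_prices)
for name, summary in (("CSA A", summary_a), ("CSA B", summary_b)):
    print(name, {k: round(v, 4) for k, v in summary.items()})
```

In this made-up example both CSAs report the same minimum price, yet the range, standard deviation, and CV differ, which is the pattern the comparison is designed to detect.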
One of the most important measures of price dispersion is the coefficient of variation. It has been widely used for comparing price dispersion across two different populations. To test hypothesis 2, the CVs of price dispersion from the two CSAs are compared. The null hypothesis is that the difference in CV between the two CSAs is zero. The t-statistic for a two-tailed test is -2.55 and the p-value is 0.01.
Hence, hypothesis 2 is strongly supported. The price dispersions presented by the two CSAs differ significantly, and a single CSA does not provide complete spatial price dispersion. Given such incomplete price dispersion information, consumers may make suboptimal purchase decisions if they rely on a single CSA. It is important to note here that the consumer's purchase decision may not be based on the minimum price alone. Consumers may rely on multiple factors such as price, retailer reputation, service, and shipping and return policies, and hence they may need complete information from potentially all sellers of a specific product. Consumers may have to incur additional search costs in order to obtain complete price dispersion information, which may include visiting multiple CSAs or conducting a sequential search across all potential sellers of the product.
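The CV comparison behind hypothesis 2 can be sketched as a t-test on per-product CVs. The text does not state whether the test was paired or independent; the sketch below assumes a paired test, because the same products are observed on both CSAs. The CV values and the function name are hypothetical and are not taken from the appendices.

```python
import math
import statistics


def paired_t(x, y):
    """t-statistic for H0: the mean of the paired differences x - y is zero."""
    if len(x) != len(y):
        raise ValueError("paired samples must have the same length")
    diffs = [a - b for a, b in zip(x, y)]
    n = len(diffs)
    return statistics.mean(diffs) / (statistics.stdev(diffs) / math.sqrt(n))


# Hypothetical per-product CVs on two CSAs (illustrative only).
cv_csa_a = [0.10, 0.12, 0.08, 0.15]
cv_csa_b = [0.13, 0.14, 0.12, 0.16]

t_cv = paired_t(cv_csa_a, cv_csa_b)
print("paired t-statistic for CV difference:", round(t_cv, 3))
```

A negative statistic here means the first CSA shows lower dispersion on average than the second; the two-tailed p-value would then be read from a t distribution with n - 1 degrees of freedom.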
5 Discussions and Implications
Just like its B&M counterpart, the online retail market also displays spatial price dispersion. Consumer heterogeneity and search costs are the two primary reasons for the existence of price dispersion in the online markets. Contemporary IS research shows that in the online markets, the law of one price does not prevail even with the emergence of CSAs. Consumer and retailer heterogeneities are provided as the two primary explanations for this continuing price dispersion in the online markets. The fundamental contribution of this research is to provide a search-cost based explanation for online price dispersion.
CSAs may not present complete price dispersion information because of their merchant-sponsored revenue model. CSAs' selection criteria for merchants are predominantly revenue-based. Moreover, many merchants do not participate in CSA-based referral programs. In this research, by comparing two different CSAs, it has been shown that the price dispersion information presented by one CSA differs significantly from that presented by the other. Consumers may not make an optimal purchasing decision by relying on such incomplete price dispersion information from a single CSA and may have to search multiple CSAs or even merchant websites in order to get comprehensive price dispersion information. Likewise, CSAs may not present accurate price dispersion information because of their data collection methodology. The majority of contemporary CSAs receive pricing data from merchants at discrete intervals. Given lower menu costs, online merchants tend to change prices more frequently. However, it may not be feasible to propagate these SKU-specific price adjustments to all CSAs in real time, and hence at any given point in time the CSAs may present incorrect prices. Realizing this, consumers may need to click through to multiple merchant websites to validate the pricing information.
Researchers have been intrigued by the ongoing existence of price dispersion in the online retail market. Prior research has provided retailer heterogeneity, consumer heterogeneity, and obfuscation based justifications for online price dispersion. This paper studies online price dispersion information from various CSAs to evaluate the quality of that information in terms of the completeness and accuracy of price quotes from multiple retailers. The results show that, because of the merchant-supported models of current CSAs and the collection of price quotes at discrete time intervals, the current generation of CSAs does not provide complete and accurate price dispersion information. More importantly, the completeness and accuracy of the price dispersion information varies from one CSA to another. Even if the user visits multiple CSAs, she may not be able to obtain complete and accurate price dispersion information. Under such conditions, users may have to incur additional search costs in order to make optimal purchasing decisions. Hence, the emergence of CSAs may not negate the search requirement, and search costs should be considered as one of the potential explanations for price dispersion in the online markets.
With regard to temporal delay, the results from Appendix B show that temporal delays not only vary from one CSA to another but also differ from one retailer to another. More importantly, even for the same retailer, temporal delays vary from one product to another. For example, for all-in-one printers, the temporal delay for Circuitcity is 3 days on Epinions.com. On the other CSAs, the temporal delay in updating the price of the all-in-one printer is 2 days. However, for digital cameras, the temporal delay for Circuitcity is at least 5 days on all CSAs. Thus, the price updates at different CSAs for different products and retailers happen at different times. In such cases, it is difficult for the user to predict any pattern, and unless she visits individual retailer websites, it is difficult to ascertain the accuracy of a given price quote. Thus, a sequential search across various retailers' websites is imperative in order to obtain accurate price dispersion information.
For selection bias, the coverage of retailers varies significantly from one CSA to another. Moreover, this coverage differs from one product to another. For example, as shown in Appendix A, while Cnet.com covers 31 retailers for product 3 (wireless desktop), Shopping.com covers only 8 retailers. However, for product 4 (digital camera), Shopping.com covers 49 retailers versus 34 in the case of Cnet.com. Likewise, the minimum price may also differ from one CSA to another. For example, for the first product in the sample (digital camera), the minimum price at Cnet.com is $219, whereas the minimum price at Shopping.com for the same camera is $180. Such evidence shows that incomplete coverage by CSAs may provide suboptimal purchasing decision support and hence may encourage additional search.
This research contributes to our knowledge of CSAs in multiple ways. First, while the current literature explains the existence of online price dispersion through various phenomena, including consumer heterogeneity, retailer heterogeneity, service heterogeneity, pricing strategies, and retailer obfuscation, this research suggests that even in the presence of CSAs, search costs may still persist in the online retail market because of selection bias and temporal delay. Thus, this research provides a search cost based explanation for the existence of online price dispersion. Second, prior researchers have provided various suggestions for the improvement of CSAs, including comprehensive product, price, promotion, and retailer specific search solutions, personalization, multi-parameter ordered lists, and bundle product solutions. This research suggests that the next generation of CSAs should focus on the basics first, that is, on reducing search costs by providing accurate and complete price dispersion information. Third, researchers have developed models to address certain limitations of CSAs such as selection bias and temporal delay. For example, it has been suggested that CSAs should use learning algorithms to conduct a selective search across a limited set of retailers in real time to assure optimal comparison shopping results with limited coverage. This research suggests that a combined approach of selective coverage and real time data extraction may optimize purchase decision making for the user.
This research has several limitations. On the one hand, these limitations restrict the generalization of the results of this study to different products and markets. On the other hand, they provide excellent future research opportunities. First, the majority of CSAs in our sample have merchant-supported revenue models. Some other CSAs, such as Bingshopping.com or Google products, that are based on alternate revenue models are excluded from the research. Further research is required to compare selection bias and temporal delay among CSAs with different revenue models. Second, the results of this study are based on a limited set of data. In particular, only two CSAs are compared for the selection bias issue. Likewise, only three retailers are studied for the temporal delay issue.
While six CSAs are analyzed for evaluating temporal delay, only two CSAs are studied for selection bias. This limited set of data is primarily due to logistical constraints. For example, with regard to selection bias, obtaining price quotes for 41 products from, on average, 22 retailers on two CSAs amounts to 1760 price quotes in a limited time frame. As the selection of CSAs in this study is based on Alexa.com rankings and the majority of popular CSAs operate similarly, it is reasonable to believe that the results of this study can be extended to the online comparison shopping field in general. However, more extensive research is required to study the broader impacts. It would also be interesting to determine the link between the prices shown on the CSAs, price dispersion, and their impact on consumers' buying behavior. One potential future research direction is to conduct a lab study to analyze such relationships.
Third, the product sample for comparing selection bias mostly includes popular products. It remains to be seen how selection bias influences the price dispersion information for obscure items. Recent research suggests that different product categories may respond differently in the online markets. The focus of this paper is on a single product category, and it can be extended to analyze the impact of temporal delay and selection bias in different product categories. Fourth, this research does not measure or quantify search costs and hence does not directly measure the impact of selection bias and temporal delay on search costs. Measuring such search costs is challenging given that it may require an exploratory study to check the availability, price, tax, and shipping information for a large number of sellers. However, one can compare the differential impact of selection bias and temporal delay on search costs. In the case of selection bias, as many retailers remain uncovered by various CSAs, one has to conduct a sequential search across individual retailers as well as multiple CSAs in order to get the complete price dispersion. In contrast, problems related to temporal delay may incur lower search costs because the user can click through hyperlinks and visit a limited set of retailers for validation purposes. Future research in this field should analyze the differential impact of these two phenomena on search costs and analyze the consumer product choice strategy under such situations.
[1] M. Arbatskaya and M. R. Baye, Are prices 'sticky' online? Market structure effects and asymmetric responses to cost shocks in online mortgage markets, International Journal of Industrial Organization, vol. 22, no. 10, pp. 1443-1462, 2004.
[2] J. P. Bailey, Intermediation and electronic markets: Aggregation and pricing in Internet commerce, Ph.D. Thesis, Technology, Management and Policy, Massachusetts Institute of Technology, Cambridge, MA, 1998.
[3] J. Y. Bakos, Reducing buyer search costs: Implications for electronic marketplaces, Management Science, vol. 43, no. 12, pp. 1676-1692, 1997.
[4] M. Baye, J. Morgan, and P. Scholten, Persistent price dispersion in online markets, in The New Economy and Beyond: Past, Present and Future (D. W. Jansen, ed.). New York: Edward Elgar Publishing, pp. 122-143.
[5] M. Baye and J. Morgan, Temporal price dispersion: Evidence from an online consumer electronics market, Journal of Interactive Marketing, vol. 18, no. 4, pp. 101-115, 2004.
[6] M. Baye and J. Morgan, Price dispersion in the lab and on the internet: Theory and evidence, The Rand Journal of Economics, vol. 35, no. 3, pp. 449-466, 2004.
[7] M. Baye and J. Morgan, Price dispersion in the small and in the large: Evidence from an internet price comparison site, The Journal of Industrial Economics, vol. 52, no. 4, pp. 463-496, 2004.
[8] K. Baylis and J. M. Perloff, Price dispersion on the internet: Good firms and bad firms, Review of Industrial Organization, vol. 21, no. 3, pp. 305-324, 2002.
[9] E. Brynjolfsson, Y. J. Hu, and M. D. Smith, From niches to riches: The anatomy of the long tail, Sloan Management Review, vol. 47, no. 4, pp. 67-71, 2006.
[10] E. Brynjolfsson and M. D. Smith, Frictionless commerce? A comparison of internet and conventional retailers, Management Science, vol. 46, no. 4, pp. 563-585, 2000.
[11] E. Brynjolfsson and M. D. Smith, The great equalizer? An empirical analysis of consumer choice behavior at an internet shopbot, Carnegie Mellon University, Tepper School of Business, 2007.
[12] R. K. Chellappa, R. G. Sin, and S. Siddarth, Price formats as a source of price dispersion: A study of online and offline prices in the domestic U.S. airline markets, Information Systems Research, vol. 22, no. 1, pp. 269-288, 2011.
[13] P. Chen and L. Hitt, Measuring switching costs and the determinants of customer retention in internet-enabled businesses: A study of the online brokerage industry, Information Systems Research, vol. 13, no. 3, pp. 255-274, 2002.
[14] K. Clay, R. Krishnan, and E. Wolff, Prices and price dispersion on the web: Evidence from the online book industry, The Journal of Industrial Economics, vol. 49, no. 4, pp. 521-539, 2001.
[15] E. Clemons, I. Hann, and L. Hitt, Price dispersion and differentiation in online travel: An empirical investigation, Management Science, vol. 48, no. 4, pp. 534-549, 2002.
[16] K. Crowston and I. MacInnes, The effects of market-enabling internet agents on competition and prices, Journal of Electronic Commerce Research, vol. 2, no. 1, pp. 1-22, 2001.
[17] G. Ellison and S. Ellison, Search, obfuscation, and price elasticities on the internet, Econometrica, vol. 77, no. 2, pp. 427-452, 2009.
[18] M. Fasli, Shopbots: A syntactic present, a semantic future, IEEE Internet Computing, vol. 10, no. 6, pp. 69-75, 2006.
[19] R. Garfinkel, R. Gopal, B. Pathak, and F. Yin, Shopbot 2.0: Integrating recommendations and promotions with comparison shopping, Decision Support Systems, vol. 46, no. 1, pp. 61-69, 2008.
[20] R. Garfinkel, R. Gopal, A. Tripathi, and F. Yin, Design of a shopbot and recommender system for bundle purchases, Decision Support Systems, vol. 42, no. 3, pp. 1974-1986, 2006.
[21] M. Goldmanis, A. Hortacsu, and C. Syverson, E-commerce and the market structure of the retail industries, The Economic Journal, vol. 120, no. 545, pp. 651-682, 2010.
[22] G. Haubl, B. Dellaert, K. Murray, and V. Trifts, Buyer behavior in personalized shopping environments, in Designing Personalized User Experiences in Ecommerce (C-M. Karat, J. O. Blom, & J. Karat, eds.). Norwell, MA, USA: Kluwer Academic Publishers, pp. 207-229.
[23] G. Haubl and K. Murray, Preference construction and persistence in digital marketplaces: The role of electronic recommendation agents, Journal of Consumer Psychology, vol. 13, no. 2, pp. 75-91, 2003.
[24] M. Janssen and J. L. Moraga-Gonzalez, Strategic pricing, consumer search, and the number of firms, The Review of Economic Studies, vol. 71, no. 249, pp. 1089-1119, 2004.
[25] D. Levy, M. Bergen, S. Dutta, and R. Venable, The magnitude of menu costs: Direct evidence from large U.S. supermarket chains, The Quarterly Journal of Economics, vol. 112, no. 3, pp. 791-825, 1997.
[26] A. Menczer, A. Monge, and W. Street, Adaptive assistants for customized e-shopping, IEEE Intelligent Systems, vol. 17, no. 6, pp. 12-19, 2002.
[27] A. Montgomery, K. Hosanagar, R. Krishnan, and K. Clay, Designing a better shopbot, Management Science, vol. 50, no. 2, pp. 189-206, 2004.
[28] J. Morgan, H. Orzen, and M. Sefton, An experimental study of price dispersion, Games and Economic Behavior, vol. 54, no. 1, pp. 134-158, 2006.
[29] X. Pan, B. Ratchford, and V. Shankar, Price dispersion on the internet: A review and directions for future research, Journal of Interactive Marketing, vol. 18, no. 4, pp. 116-135, 2004.
[30] X. Pan, B. Ratchford, and V. Shankar, Can price dispersion in online markets be explained by differences in e-tailer service quality?, Journal of the Academy of Marketing Science, vol. 30, no. 4, pp. 433-445, 2002.
[31] B. Pathak, A survey of the comparison shopping agent-based decisions support systems, Journal of Electronic Commerce Research, vol. 11, no. 3, pp. 177-192, 2010.
[32] P. Pedersen, Behavioral effects of using software agents for product and merchant, International Journal of Electronic Commerce, vol. 5, no. 1, pp. 125-141, 2000.
[33] J. Reinganum, A simple model of equilibrium price dispersion, Journal of Political Economy, vol. 87, no. 4, pp. 851-858, 1979.
[34] S. Salop and J. Stiglitz, Bargains and ripoffs: A model of monopolistically competitive price dispersion, Review of Economic Studies, vol. 44, no. 3, pp. 493-510, 1977.
[35] P. Scholten and A. Smith, Price dispersion then and now: Evidence from retail and e-tail markets, Advances in Applied Microeconomics, vol. 11, no. 11, pp. 63-88, 2002.
[36] M. Smith, The impact of shopbots on electronic markets, Journal of the Academy of Marketing Science, vol. 30, no. 4, pp. 446-454, 2002.
[37] A. Sorensen, Equilibrium price dispersion in retail market of prescription drugs, Journal of Political Economy, vol. 108, no. 4, pp. 833-862, 2000.
[38] S. Sproule and N. Archer, A buyer behavior framework for the development and design of software agents in e-commerce, Internet Research: Electronic Networking Applications and Policy, vol. 10, no. 5, pp. 396-405, 2000.
[39] B. Su, Consumer e-tailer choice strategies at on-line shopping comparison sites, International Journal of Electronic Commerce, vol. 11, no. 3, pp. 135-159, 2007.
[40] P. Todd and I. Benbasat, The use of information in decision making: An experimental investigation of the impact of computer-based decision aids, MIS Quarterly, vol. 16, no. 3, pp. 373-394, 1992.
[41] V. Trifts and G. Haubl, Information availability and consumer preference: Can online retailers benefit from providing access to competitor price information?, Journal of Consumer Psychology, vol. 13, no. 2, pp. 149-159, 2003.
[42] H. Varian, A model of sales, American Economic Review, vol. 70, no. 4, pp. 651-659, 1980.
[43] R. Waldeck, Search and price competition, Journal of Economic Behavior & Organization, vol. 66, no. 2, pp. 347-357, 2006.
[44] Y. Wan and G. Peng, What's next for shopbots?, Computer, vol. 43, no. 5, pp. 20-26, 2010.
[45] B. Xiao and I. Benbasat, E-commerce product recommendation agents: Use characteristics, and impact, MIS Quarterly, vol. 31, no. 1, pp. 137-209, 2006.
[46] Y. Xu and H. Kim, Order effect and vendor inspection in online comparison shopping, Journal of Retailing, vol. 84, no. 4, pp. 477-486, 2008.
Appendix A: Comparison of Price Dispersion Information
Appendix B: Comparison of Temporal Delay in Price Update
Received 28 April 2010; received in revised form 25 December 2011; accepted 29 January 2012