Conclusions
The linear regression method outperformed the non-parametric bootstrap method, which suggests that the granularity of characteristics such as mileage and year matters. Though we divided cars into mileage and year quintiles for the bootstrap, this did not capture the fact that a car near the edge of a quintile may be more similar to cars in the neighboring quintile than to many of the cars in its own.
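The edge effect can be illustrated with a small sketch. The mileages below are made up for illustration and are not from our dataset, and the helper names `quintile_edges` and `quintile_of` are hypothetical, not the code we used for the bootstrap.

```python
def quintile_edges(values):
    """Return the four cut points that split the values into five equal-count bins."""
    ordered = sorted(values)
    n = len(ordered)
    return [ordered[(k * n) // 5] for k in range(1, 5)]


def quintile_of(value, edges):
    """Return the index (0 to 4) of the quintile that a value falls into."""
    return sum(value >= edge for edge in edges)


# Hypothetical mileages, evenly spread from 0 to 99,000 miles.
mileages = [1000 * i for i in range(100)]
edges = quintile_edges(mileages)

car_a, car_b, car_c = 19000, 20000, 1000

# car_a and car_c share a quintile even though they differ by 18,000 miles,
# while car_a and car_b differ by only 1,000 miles but land in different quintiles.
same_bin = quintile_of(car_a, edges) == quintile_of(car_c, edges)
split_pair = quintile_of(car_a, edges) != quintile_of(car_b, edges)
print(same_bin, split_pair)
```

A regression on the raw mileage treats `car_a` and `car_b` as nearly identical, which is the behavior the binned bootstrap loses.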
The three-factor linear model (price, velocity, and acceleration) performed even better than the linear model using features and prices at two points in time. This suggests that the overall shape of the bidding history has substantial predictive power. However, this model requires many bids in the previous bidding history -- at least 20 -- so combining the two models may lead to even better results that apply to a wider range of auctions.
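As a rough sketch of how the three factors could be read off a bidding history, the function below uses simple finite differences over the first, middle, and last bids. This is an assumption made for illustration: the exact definitions of velocity and acceleration are given on the model's own page, and the name `bid_features` is hypothetical. The sketch also assumes bid times are strictly increasing.

```python
MIN_BIDS = 20  # minimum bidding-history length quoted in the text


def bid_features(times, prices, min_bids=MIN_BIDS):
    """Return (price, velocity, acceleration) at the latest bid.

    Returns None when the history is shorter than min_bids.
    Assumed definitions (illustrative only):
      velocity     = price slope over the second half of the history
      acceleration = change in slope between the two halves, divided by
                     the time between the midpoints of the two halves
    """
    if len(prices) < min_bids or len(times) != len(prices):
        return None

    mid = len(prices) // 2
    t0, t1, t2 = times[0], times[mid], times[-1]
    p0, p1, p2 = prices[0], prices[mid], prices[-1]

    early_slope = (p1 - p0) / (t1 - t0)
    late_slope = (p2 - p1) / (t2 - t1)
    acceleration = (late_slope - early_slope) / ((t2 - t0) / 2)
    return (p2, late_slope, acceleration)


# Hypothetical history: one bid per hour, price growing quadratically.
times = list(range(20))
prices = [100 + t * t for t in times]
features = bid_features(times, prices)
short_history = bid_features(times[:5], prices[:5])  # too few bids, so None
```

A `None` result is one natural place to fall back to the two-point model, which is the kind of combination suggested above.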
As a team, we learned quite a bit from the project, both in gathering the data and in the actual analysis.
Some of the most frustrating moments came during the scraping process, when we realized that we had to work around eBay's limit on how much data can be scraped from a single IP address.
Another frustration was the long run time of some of the algorithms; because of the sheer size of the data, it typically took a while even to load it into Python.
On the positive side, it was a great experience to take three models and run with them. We all had different ideas, inspirations, and motivations for our models, and being able to explore them was very rewarding.
While some of our results were a bit disappointing, since we couldn't get near 100% accuracy, attempting to model a real-world phenomenon was nevertheless a great learning experience.
Further Work
For a discussion of each individual model's extensions, please see that model's page.
Independent of the models, some possible extensions of our project include:
1. Expand the dataset to include a more diverse array of auctioned items
2. Apply this kind of modeling to other online auction platforms
3. Incorporate Buy It Now and other price characteristics into our models