Some details
Overview:
Our client wanted up-to-date information on how often products were added or removed and how their prices changed. That led us to create Octopus Webscraper, a data aggregator that parses the required data at regular intervals and stores it in a database. The client receives the statistics they need based on this data. The scraper collects the bank name, product name, interest rate, minimum investment, maximum investment, notice period, and account type.
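To illustrate how statistics on added and removed products and on price changes could be derived from the collected fields, here is a minimal sketch that compares two scraping runs. It is not the project's actual code: the `Product` class, the `diff_snapshots` function, the field units, and the choice to treat the interest rate as the "price" are all assumptions made for illustration, and it is written for current Python 3 rather than the Python 2.7 listed under Technologies.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Product:
    """One scraped record (hypothetical layout based on the listed fields)."""
    bank_name: str
    product_name: str
    interest_rate: float       # assumed to be a percentage
    min_investment: float
    max_investment: float
    notice_period_days: int    # unit is an assumption
    account_type: str

    @property
    def key(self) -> Tuple[str, str]:
        # Assumption: bank name plus product name identifies a product.
        return (self.bank_name, self.product_name)


def diff_snapshots(old: List[Product], new: List[Product]) -> Dict[str, list]:
    """Compare two scraping runs and report additions, removals, rate changes."""
    old_by_key = {p.key: p for p in old}
    new_by_key = {p.key: p for p in new}

    added = sorted(new_by_key.keys() - old_by_key.keys())
    removed = sorted(old_by_key.keys() - new_by_key.keys())
    rate_changes = [
        (key, old_by_key[key].interest_rate, new_by_key[key].interest_rate)
        for key in sorted(old_by_key.keys() & new_by_key.keys())
        if old_by_key[key].interest_rate != new_by_key[key].interest_rate
    ]
    return {"added": added, "removed": removed, "rate_changes": rate_changes}
```

Calling `diff_snapshots(previous_run, current_run)` after each scheduled run would give the three lists from which frequency statistics can be counted over time.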
What we did:
•managed the project from business requirements analysis through to delivery
•created prototypes to demonstrate how information will be collected
•deployed the environment on Google App Engine
•updated the parser, made it more error-tolerant, and added reporting of parser errors
•set up a database for data storage
•exposed an API that returns the data in a convenient JSON format
•added a cron task that collects customer data at a set frequency
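The error-tolerant parsing and JSON output described in the list above might look roughly like the sketch below. It is a simplified illustration under stated assumptions, not the delivered implementation: `parse_row`, `parse_all`, `to_json`, and the `REQUIRED_FIELDS` tuple are hypothetical names, only three of the collected fields are handled, and the code targets current Python 3. A bad row is recorded as an error instead of stopping the whole run.

```python
import json
from typing import Any, Dict, List

# Hypothetical subset of the collected fields that every row must contain.
REQUIRED_FIELDS = ("bank_name", "product_name", "interest_rate")


def parse_row(raw: Dict[str, str]) -> Dict[str, Any]:
    """Validate and normalise one scraped row; raise ValueError on bad input."""
    for field in REQUIRED_FIELDS:
        if not raw.get(field, "").strip():
            raise ValueError("missing field: " + field)
    return {
        "bank_name": raw["bank_name"].strip(),
        "product_name": raw["product_name"].strip(),
        # Accept both "1.5" and "1.5%"; float() raises ValueError otherwise.
        "interest_rate": float(raw["interest_rate"].strip().rstrip("%")),
    }


def parse_all(rows: List[Dict[str, str]]) -> Dict[str, Any]:
    """Parse every row, collecting errors instead of aborting on the first one."""
    products: List[Dict[str, Any]] = []
    errors: List[Dict[str, Any]] = []
    for index, raw in enumerate(rows):
        try:
            products.append(parse_row(raw))
        except ValueError as exc:
            errors.append({"row": index, "error": str(exc)})
    return {"products": products, "errors": errors}


def to_json(result: Dict[str, Any]) -> str:
    """Serialise the parse result as the body of a JSON API response."""
    return json.dumps(result, sort_keys=True)
```

Keeping the errors next to the parsed products means a scheduled run can still store the valid rows while the failures remain visible to whoever monitors the parser.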
Technologies:
•Python 2.7
•MySQL, Cron Jobs, REST API, Tornado
•Crawling
•HTTP, Google App Engine