December 2, 2019
As the ‘Data science hierarchy of needs’ below aptly depicts, the DS/ML-related requirements of most companies tend to span several, sometimes all, layers of the pyramid.
With our DataFactory and Learning Tracks offerings we cover four layers: from ‘Move/Store’ (DataStore & DataFactory), to ‘Explore/Transform’ (data wrangling and feature engineering), to ‘Aggregate/Label’ (data inference, analytics), through ‘Learn/Optimize’, which stands for modeling and prediction using both classic ML and deep learning algorithms.
Qauzativ was born out of the need for end-to-end solutions to typical DS/ML tasks, and out of the frustration of trying to find them.
We required a complete solution for credit risk modeling that would start by putting raw data into a coherent structure, proceed to generating variables (features) from that data, and then go through all the iterative DS/ML steps from data exploration to model deployment.
There are tutorials, textbooks and online courses here and there, but most of them contain deliberately simplified toy examples, and it is difficult to find a solution as complete as the one described above.
In the process of building our products and services, we developed code libraries that are essentially a series of DS/ML tutorials.
Alongside the ML code libraries, we built DataStore, a dedicated data silo for data science projects that transforms raw XML credit reports into hundreds of variables and handles other data sources, such as ERP and CRM systems, that typically exist in corporate IT ecosystems.
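To make the idea of turning a raw XML credit report into model variables more concrete, here is a minimal sketch in Python. It is not DataStore's actual code: the report layout (`report`, `account`, `balance`, `days_past_due`), the function name `extract_features` and the four variables it produces are all assumptions invented for this illustration, and a real credit report would be far richer.

```python
import xml.etree.ElementTree as ET

# Hypothetical report layout, invented for this illustration only.
SAMPLE_REPORT = """
<report>
  <account><balance>1200.50</balance><days_past_due>0</days_past_due></account>
  <account><balance>300.00</balance><days_past_due>45</days_past_due></account>
  <account><balance>0.00</balance><days_past_due>0</days_past_due></account>
</report>
"""


def extract_features(xml_text):
    """Aggregate account-level records into a flat dict of variables."""
    root = ET.fromstring(xml_text.strip())

    balances = []
    days_past_due = []
    for account in root.iter("account"):
        # Missing fields fall back to zero (an assumption of this sketch).
        balances.append(float(account.findtext("balance", default="0")))
        days_past_due.append(int(account.findtext("days_past_due", default="0")))

    n_accounts = len(balances)
    n_delinquent = sum(1 for d in days_past_due if d > 0)
    return {
        "n_accounts": n_accounts,
        "total_balance": sum(balances),
        "max_days_past_due": max(days_past_due, default=0),
        "share_delinquent": n_delinquent / n_accounts if n_accounts else 0.0,
    }


features = extract_features(SAMPLE_REPORT)
print(features)
```

Each report becomes one row of variables, which is the shape most modeling libraries expect. A production pipeline would derive many more variables per report and would validate the input against the real schema.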
We compiled all our documentation as RMarkdown files and placed them into a knowledgebase, a static website for documentation management. It proved so easy to use and so functional that we decided to offer it as a standalone product: a hub for centralized storage, maintenance and sharing of codebase and documentation that can be built in a few hours and maintained by any DS/ML team member.
We also built a team of people with doctorate and graduate degrees from well-known universities who teach and practice DS/ML on a daily basis.
We offer all that to the public, including the open source edition of DataStore, as our response to the perceived lack of coherent practice-based tutorials that actually help to address real-life ML problems.
Our reasons for doing so are as follows:
DS/ML is the game changer
We are in the middle of an AI and machine learning boom that is bringing major changes. It is already affecting our lives: we use data-driven products from IT giants on a daily basis and expect the same experience from banks, telecoms and retailers. Machine learning will change the landscape of many industries.
There will be winners and losers
The leaders will gain decisive competitive advantages over the laggards by building new products and services with the use of ML. Business processes will be automated more efficiently, and the winners will find more ways to cut costs, increase revenues, and attract and retain more customers.
Three things, among others, are key to bringing AI to the mainstream
The technical expertise to develop and innovate, the organizational skills to successfully implement and embed this innovation into our daily work, and the leadership know-how to make the best use of it and invest wisely. So, however you slice it, success in AI depends first on expertise and education.