
Powered by Data, Driven by Science

PickleTech was born to provide Data Science services that help domain experts solve impactful problems in fields such as Health, Sports Performance, Supply Chain and Mobility.

The Context

In recent years, data availability has grown at an impressive pace across many different fields, thanks to increasingly efficient data recording devices and processing tools. That is amazing! It has enabled the creation of many high-potential applications across a wide range of areas, from fundamental research in particle physics to drug discovery.

However, when working with practitioners in fields such as Health or Supply Chain, it is discouraging to sense that there is sometimes a perception that several of the solutions being developed are not meeting their expectations. The impact of Data Science solutions, broadly understood, is not yet on par with the hype.

The Challenge

It is not simple to summarize why this is happening: the specific causes depend on the application and the field. But some patterns keep recurring. They combine several data quality issues, such as sampling bias, records that are incomplete by design, and other miscommunicated assumptions about how the data is recorded and processed.

Furthermore, biased solution designs combined with improper validation frameworks lead to applications that perform very differently from what the flawed development suggested. Some solutions are simply unfit for real-world use, for instance when they can process only a very limited piece of information compared with the more holistic view of a domain expert, or when domain experts cannot interpret them.

Have a look at this article to read about these problems in the context of AI solutions developed for the Covid crisis.

Regardless of current trends, the truth is that many of these patterns have nothing to do with failures in software development or in the industrialization of a data science solution. They point instead to the fundamentals of statistics and science.

Why PickleTech

At PickleTech we put special emphasis on providing our clients with the best Data Science methodology to deal with the issues above. We are determined to work on solutions that do translate into business and organizational impact.

In our experience, encouraging, protecting, and boosting two core aspects of the data science development methodology overcomes these problems most of the time, and leads to more impactful solutions at the end of the development pipeline.

  • Close collaboration with domain experts. As data scientists, we think that diving deep into the project context together with domain experts is fundamental to understanding the business and user needs. It is essential for dissecting their operational framework and framing the right business questions, and it enables us to master all the details required to craft a solution with maximum impact. This collaboration also establishes common ground that helps highlight the importance of development aspects such as solution interpretability, which is key to building trust in a model and thus to its impact.

  • Rigor and a robust scientific methodology. There is Science in Data Science, even if it is too often badly overlooked. A proper hypothesis testing and development scheme, validation frameworks that ensure true statistical learning, and a fair assessment of the final solution's impact are essential for long-lasting solutions. We believe impact is real only when it is properly measured. This is why we never compromise on scientific integrity: Data Science must be evidence-based.

Of course, we value analytic tools across a wide range of techniques: from data engineering and ETL, to advanced statistics, Machine Learning and Deep Learning models, and on through the whole spectrum of Artificial Intelligence.

We also value techniques that have a large impact on many organizations yet are often undervalued: time series, hypothesis testing and bootstrapping, to mention a few. We have the knowledge to craft models for your problem and data, and we use it when necessary. We love research and innovation; it is where we come from!
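
As a small illustration of one of the techniques mentioned above, here is a minimal sketch of a percentile bootstrap confidence interval for a mean, written in Python with only the standard library. The function name `bootstrap_mean_ci`, its parameters and the sample values are all hypothetical, made up for this example; it is a sketch under those assumptions and not a description of how any particular project is run.

```python
import random
import statistics


def bootstrap_mean_ci(sample, n_resamples=2000, confidence=0.95, seed=0):
    """Percentile bootstrap confidence interval for the mean of `sample`.

    Hypothetical helper for illustration: resample with replacement,
    compute the mean of each resample, and read the interval bounds
    off the sorted resample means.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(sample)
    means = sorted(
        statistics.fmean(rng.choices(sample, k=n))  # one resample, same size
        for _ in range(n_resamples)
    )
    alpha = 1.0 - confidence
    lower_idx = int((alpha / 2) * n_resamples)
    upper_idx = int((1 - alpha / 2) * n_resamples) - 1
    return means[lower_idx], means[upper_idx]


# Made-up measurements (assumption): change in some metric per unit
# after an intervention. Not real data.
diffs = [1.2, -0.4, 0.8, 2.1, 0.3, 1.7, -0.9, 0.6, 1.1, 0.2, 1.5, 0.9]

low, high = bootstrap_mean_ci(diffs)
print(f"observed mean change: {statistics.fmean(diffs):.2f}")
print(f"95% bootstrap interval: [{low:.2f}, {high:.2f}]")
```

An interval that sits clearly away from zero suggests the measured change is unlikely to be sampling noise alone; an interval that straddles zero is a reminder that the claimed impact has not yet been properly measured.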

We thrive on working with large, heterogeneous data sets that come with constraints and limitations. At the same time, we are experienced scientists. Together we have years of experience dissecting problems, combining those aspects, and extracting useful insights for you in highly competitive fields such as Sports Performance, Health, Supply Chain, International Development and Security, and High Energy Physics. Powered by Data, Driven by Science.

We would love to hear from you and help you with the exciting problems that affect your operations!