I recently attended BigDataLDN, a technology conference and exhibition at Olympia London. As the name suggests, we spent two days focused on all things data.
Unlike some of my colleagues, I’m not a data expert, so this was an opportunity for me to get a feel for the industry first-hand. With the advent of accessible AI and LLMs, data engineering is quickly increasing in importance. And although there’s a market need for all teams to be more literate in this space, the ability to capture and harness good-quality data continues to be the real work that needs doing.
Inspiring sessions from Jaguar TCS, the Department for Education, and Women in Data prove that great innovation is already happening. Jaguar are using Formula E track data to improve both electric car performance and the efficiency of their upcoming, exclusively electric consumer vehicles. The DfE now has accurate data on nationwide school attendance by midday, every single day. The panel by Women in Data showcased the organisation’s growth to more than 33,000 members, strengthened by the presence of key female role models and evidenced by representation from the Met Police, Dstl and MoD.
1. AI Will Drive Support for Data Teams
A central theme among several speakers was the number of data teams struggling with their budgets. Niamh O’Brien from Fivetran stated that tech departments typically have a budget 10 times that of their data teams. It seems strange that the level of investment doesn’t match the current hype around AI and LLMs, especially with everyone looking to integrate AI into their products. It’s likely the growth potential of leveraging AI will force companies to invest more in data.
The pushback against investing in data projects seems to stem from their unreliable return on investment. Even if we acknowledge there is some inherent risk in data projects due to the exploratory nature of their outcomes, that doesn’t account for the staggering statistic that 85% of data projects fail. Leveraging AI does nothing to address the failure rate either, so everyone’s left wondering what can be done.
2. Good Development Practices Save Data Projects
Enter Jesse Anderson with one of the most popular talks, “Why Most Data Projects Fail.” Speaking to a packed, standing-room-only audience that could have passed for a crowd of Swifties, Jesse gave a talk that amounted to agile project management 101! The message was: “deliver what you can first, watch out for scope creep, and make sure you’ve got a balanced team”. It was all absolutely correct, absolutely obvious, and absolutely a surprise to me that such a topic was covered.
“Choice of data platform isn’t the issue. Projects don’t fail because you chose Snowflake over Databricks,” said Jesse. However, it’s easy to see why people might get bogged down by that decision.
3. The Data Landscape Needs Some Noise Reduction
A diagram of the data landscape from FirstMark Venture Capital appeared in at least five presentations, and for good reason. It represents probably the most significant challenge that beginners to the industry face. As I walked around the exhibition, it was impossible to distinguish between the offerings of many of the vendors, with a number of stands offering the “solution to all data problems”. Exhibitors generally fell into one of two camps: “we can do everything end to end” and “we can do one specific thing well, but we can also do everything else”. Although the diagram suggests everyone falls into distinct categories, I don’t think many of the vendors would put themselves in those restrictive boxes.
It struck me that there is an opportunity, as demonstrated by Jesse Anderson, to be an advisor in this space. For struggling data teams, but especially for newcomers, finding a partner to help cut through the noise in this increasingly complex space is essential.
Where Will Data Be Next Year?
I look forward to going back next year, a little better educated, and more equipped to cut through the noise. It’ll be exciting to see how much has changed in a space that is poised to expand exponentially with the advent of AI.
As the industry moves towards more accessible data science, I’m sure we’ll see continued emphasis on the fundamentals of data quality and engineering. The industry wants business stakeholders to be using AI/LLMs to complete tasks that were previously only feasible for engineers. Excitingly, OpenAI’s DevDay suggests that we are already close to that reality.