Pain Points In An Engineering Organization

It’s been a few weeks since we made the tough decision to refocus and build in a different direction at Wildcard. The details are explained well by Jordan here, but the short version is that after three years of building awesome data technology with a great team, and releasing an acclaimed product, and despite hundreds of thousands of users, we determined that our app just wasn’t growing fast enough to lead to a viable business in the short term. It would be more responsible to our investors and everyone involved to take our remaining capital and work towards building something that’s both a great product and a great business.

Before moving forward, we have taken some time to look back at the last three years and do some retrospectives on our company, process, team, and technology. I thought that it might be useful to share some of the technical pain points that we felt as an engineering organization, both here at Wildcard and in the past at other companies I’ve been a part of. These are all known problems for which a number of companies, pieces of software, and recommended processes offer solutions. However, despite all of the tools and information you can find out there, my experience shows that engineering organizations generally still struggle with these issues. Perhaps there’s an opportunity for the right companies to emerge to serve these needs.

1. Local developer environment setup is never as easy as it should be.

At the end of a new team member’s first week on the job I would always check in to see how the week was going and to get some feedback on what could have gone better. If the team member was a developer, they would almost always be sure to respond that they wished getting the full development stack up and running on their laptop were easier. We’ve considered and/or tried the gamut of approaches here including:

  • wiki checklists
  • virtualization
  • local docker/container solutions
  • per service setup documentation
  • pair programming the setup process
  • etc

Despite everyone’s best intentions, the holy grail of handing a new team member a laptop and having them up and running and ready to develop quickly remains elusive. In a service-oriented architecture, with many clients and various server technologies, it’s a never-ending fight to keep the full local setup instructions up to date and accessible. Install and update Xcode and Android Studio. Run Elasticsearch, MongoDB, Redis. Get IntelliJ and install all JVM dependencies. Use RVM and install the right versions of Ruby and all gems. Get your Python environment set up. Access and load the right seed data and keep your fingers crossed that it’s up to date. Get your AWS credentials configured correctly to access cloud services. The list goes on.
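One small thing that can at least shorten the feedback loop is a "doctor" script that reports what is missing instead of making the new hire discover it one failure at a time. The following is only a minimal sketch, not something we actually ran: the command list and service names are hypothetical placeholders, and the ports are simply the defaults those services ship with.

```python
import shutil
import socket

# Hypothetical requirements for illustration; replace with your own stack.
REQUIRED_COMMANDS = ["ruby", "python3", "java", "aws"]
REQUIRED_SERVICES = {
    "Elasticsearch": 9200,  # default ports, assuming an unmodified local install
    "MongoDB": 27017,
    "Redis": 6379,
}


def missing_commands(commands):
    """Return the commands that cannot be found on PATH."""
    return [cmd for cmd in commands if shutil.which(cmd) is None]


def port_is_open(port, host="127.0.0.1", timeout=0.5):
    """Return True if something accepts TCP connections on host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def check_environment(commands=REQUIRED_COMMANDS, services=REQUIRED_SERVICES):
    """Collect human-readable problems; an empty list means every check passed."""
    problems = [f"command not found: {cmd}" for cmd in missing_commands(commands)]
    for name, port in services.items():
        if not port_is_open(port):
            problems.append(f"{name} is not listening on port {port}")
    return problems


if __name__ == "__main__":
    for problem in check_environment():
        print("MISSING:", problem)
```

A script like this does not install anything, so it does not solve the problem. It only makes the gap between the wiki checklist and the actual laptop visible, and it still has to be kept up to date by someone.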

I don’t know what the right solution here is absent someone’s singular responsibility to ensure that there’s a one-click setup process that works and is up to date for local development the same way we have our various ops processes set up for production, but there’s probably an opportunity for a startup to come along and dramatically simplify the game here. Even though tons of proposed solutions exist, I’ve rarely seen anyone mention the ease of local environment setup as a positive at their company.

2. Getting all your data queryable through one interface — accessible to developers, analysts, and executives — requires a lot of work.

Having run engineering at two data-oriented companies, I’ve participated in the building of a number of data pipelines, and used a number of data warehousing solutions. There are tons of companies that assist in building and managing ETL processes, that create reporting dashboards and query interfaces for your data, and even abstract away the process of collecting and querying metrics and instead just provide you with a high level executive view of what they think you need to know. Yet somehow, the fact remains that if your data is spread across different internal servers and external services, then you generally end up having to rebuild the same data pipeline from scratch, and you still end up with an output that’s hardly accessible to the business leader that needs to use the data to make decisions.

Imagine this scenario: you have your local application data stored in Postgres or MongoDB. You have web analytics in Google Analytics. You have user interaction data stored in Mixpanel. You have logs and archived data in S3 and Redshift. Some of this data can be queried in real time, but some computations require batch processing. You’ve built ETL jobs to import everything to Redshift. You’re using Spark or Hadoop to run batch computations. And you’re using an off-the-shelf visualization tool to create dashboards and charts and expose a custom query interface.
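To make the shape of that plumbing concrete, here is a toy sketch of the "normalize everything into one queryable table" step. It is an illustration under loud assumptions: SQLite stands in for the warehouse, and the record fields (`user_id`, `created_at`, `distinct_id`, `event`, `time`) are hypothetical shapes, not the actual export formats of any of the services named above.

```python
import sqlite3


def from_app_db(row):
    """Map a hypothetical application-database row to the shared event shape."""
    return (row["user_id"], "signup", row["created_at"], "app_db")


def from_analytics(event):
    """Map a hypothetical analytics-export record to the shared event shape."""
    return (event["distinct_id"], event["event"], event["time"], "analytics")


def load(conn, rows):
    """Append normalized rows to a single events table, creating it if needed."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS events "
        "(user_id TEXT, name TEXT, occurred_at TEXT, source TEXT)"
    )
    conn.executemany("INSERT INTO events VALUES (?, ?, ?, ?)", rows)
    conn.commit()


def events_per_user(conn):
    """One query over data that originally lived in separate systems."""
    return conn.execute(
        "SELECT user_id, COUNT(*) FROM events GROUP BY user_id ORDER BY user_id"
    ).fetchall()


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")  # stand-in for a real warehouse
    app_rows = [{"user_id": "u1", "created_at": "2016-01-01T00:00:00Z"}]
    analytics = [
        {"distinct_id": "u1", "event": "card_viewed", "time": "2016-01-02T10:00:00Z"},
        {"distinct_id": "u2", "event": "card_viewed", "time": "2016-01-03T09:30:00Z"},
    ]
    load(conn, [from_app_db(r) for r in app_rows])
    load(conn, [from_analytics(e) for e in analytics])
    print(events_per_user(conn))
```

Each mapping function is trivial on its own. The real cost is that every source needs one, every schema change breaks one, and someone has to own all of them.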

None of the above sounds egregious or like too much work individually, but what I’ve seen is that every company requires a data team in order to set this up, maintain it, and iterate on it. And in the end no one finds it as easy to use or as impactful as they imagined it would be. Magic business insights don’t just fly out of the plumbing. You still need to do real work and analysis to get the answers you need.

Of course this work is worth doing, but it’s such a shame that every single company has to do all the work of building this relatively similar pipeline from scratch. There’s probably an opportunity for some shared leverage here.

3. Log management solutions still make it difficult for developers to answer the question “what happened?”

I won’t harp on this, as there really are a number of solutions and strategies out there to address log management, but I’ll just say that most of the promising approaches we’ve tried end up not really delivering the intended impact. Splunk is expensive and requires a lot of maintenance and discipline around log formatting across various services and technologies. The ELK stack (Elasticsearch, Logstash, and Kibana) is a great open alternative. However, without discipline and maintenance, the data ends up not being highly queryable due to format inconsistencies or operational issues on the logging side or the storage/serving side. Sometimes developers tailing and grepping log files is more useful than the promise of querying structured log data across all services, and as such, setting up simple generic machine agents that rotate and archive logs to S3 for later processing can actually be the simplest approach.
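As a rough illustration of how little such an agent needs to do, here is a minimal sketch of the archive step. It assumes something else (logrotate, for example) has already rotated `app.log` into files like `app.log.1`; the `logs/` key prefix and the `upload` callable are hypothetical, and the upload is injected so the sketch does not depend on any particular storage client.

```python
import gzip
import shutil
from pathlib import Path


def archive_rotated_logs(log_dir, upload, pattern="*.log.*"):
    """Gzip each rotated log file, pass it to `upload`, then delete local copies.

    `upload(key, path)` is a caller-supplied callable (an assumption of this
    sketch) that ships the compressed file to long-term storage.
    Returns the list of keys that were uploaded.
    """
    archived = []
    for path in sorted(Path(log_dir).glob(pattern)):
        if path.suffix == ".gz":
            continue  # compressed leftover from an interrupted run; skip it
        gz_path = path.with_name(path.name + ".gz")
        with open(path, "rb") as src, gzip.open(gz_path, "wb") as dst:
            shutil.copyfileobj(src, dst)
        key = f"logs/{gz_path.name}"  # hypothetical key layout
        upload(key, gz_path)
        # Only remove local files once the upload call has returned.
        path.unlink()
        gz_path.unlink()
        archived.append(key)
    return archived
```

With boto3, `upload` could be a small wrapper around the S3 client's `upload_file(Filename, Bucket, Key)`, with the bucket name being whatever your team uses. The live `app.log` is left untouched because it does not match the pattern, so developers can keep tailing it.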

4. DevOps knowledge remains siloed.

In many teams and organizations, everyone has a little piece of DevOps understanding in order to operate their own services, but very few people end up understanding the infrastructural components that sit below the individual services. The web-focused team member can deploy and debug web-related issues, and the data engineer can set up replication on the database, but not many people understand the health checks, monitoring infrastructure, and container lifecycle tools well enough to make improvements or changes to the company’s DevOps workflow. This is a common challenge I’ve heard from a lot of other engineering managers.

I want to emphasize that any of the above issues could certainly be addressed by prioritizing them within the engineering organization’s culture and resource allocation. If you want your startup to pride itself on killer DevOps processes, where logs manage themselves, all metrics are instantly accessible and visible, and laptops come set up ready to build, you can certainly accomplish this. But the amount of effort and resources required to nail this in a pre-product-market-fit tech startup will come at a cost. Outsourcing non-core IP functionality is advisable until you’ve found that fit, and as mentioned, while a number of tools exist to tackle each of these areas, I still rarely see or hear of a company that’s thrilled with its solution to many of these issues. I think there’s an opportunity to create some services that ease the pain here.

Let me know if you disagree or have recommendations @petkanics.

Written by

Building live streaming on the blockchain at Livepeer. Previously Founder, VP Eng at Wildcard and Hyperpublic (acquired by Groupon).
