Date: 31 Dec 2020

Like most of you, when I first explored a career in the data field, I wanted to be a data analyst. Data work had been called the sexiest job of the 21st century (according to a Harvard Business Review article titled "Data Scientist: The Sexiest Job of the 21st Century"). At that time, I didn’t know about the other data professionals who are also critical to generating value from data.

We may define data analysis differently. For me, data analysis means generating information out of data. Information is something that carries value and insight for the company, while data is something raw. Typical tasks for a data analyst are identifying the KPIs/metrics to be monitored, creating and monitoring them (e.g. on a dashboard), and analyzing (sometimes we say playing around with) raw data to see if there is anything valuable to the company or your team. Someone with more of a statistics background may also build prediction models.
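To make the "data vs. information" idea concrete, here is a minimal sketch of turning raw records into two simple KPIs. The order records, field names, and the two functions are all made up for illustration; a real analyst would more likely do this in SQL or a BI tool.

```python
from collections import defaultdict

# Hypothetical raw data: one dictionary per order (field names are assumptions).
raw_orders = [
    {"date": "2020-12-29", "customer": "a", "amount": 120.0},
    {"date": "2020-12-29", "customer": "b", "amount": 80.0},
    {"date": "2020-12-30", "customer": "a", "amount": 50.0},
]


def daily_revenue(orders):
    """Sum order amounts per day: a KPI you could plot on a dashboard."""
    totals = defaultdict(float)
    for order in orders:
        totals[order["date"]] += order["amount"]
    return dict(totals)


def average_order_value(orders):
    """Average amount per order; returns 0.0 when there are no orders."""
    if not orders:
        return 0.0
    return sum(order["amount"] for order in orders) / len(orders)


revenue_by_day = daily_revenue(raw_orders)
aov = average_order_value(raw_orders)
```

The raw list by itself tells the team very little, while `revenue_by_day` and `aov` are numbers someone can actually monitor and act on.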

If I had joined a bigger company instead of a startup, I might not have recognized the importance of data engineering and data governance. I was the first data analyst hired at my current company, around 1.5 years ago. Clearly, they didn’t know how to set up a data team back then (and neither did I). They might have thought that hiring a data analyst would immediately help generate insights from data. I think this is quite common among companies, big or small, that have just started to build a data team. But the fact is that data does not just exist, and there is no guarantee that existing data is of good, usable quality.

If data is an asset to a company, it needs to be managed. Data governance is a framework for managing a company's data. It identifies the data points to be captured across the whole business process, and it defines and ensures data quality. Only when the asset is managed properly can data analysts have enough quality data to do their work.

Data engineering is responsible for capturing and transforming data. Once the governance framework is defined, data engineers work on capturing the data and validating its quality against the framework before storing it in a database. After the data is stored properly, transformation is needed to combine data from different sources and present it in a user-friendly form, so that data analysts can do their work efficiently.
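The validate-then-transform flow described above can be sketched roughly like this. The quality rules, field names, and sample records are assumptions I made up for illustration; real pipelines usually rely on dedicated tooling rather than plain Python.

```python
# Hypothetical quality rule from a governance framework:
# every order must carry these fields.
REQUIRED_FIELDS = ("order_id", "customer_id", "amount")


def is_valid(record):
    """Return True if all required fields are present and amount is a non-negative number."""
    if any(record.get(field) is None for field in REQUIRED_FIELDS):
        return False
    return isinstance(record["amount"], (int, float)) and record["amount"] >= 0


def build_analyst_table(orders, customers):
    """Drop invalid orders, then combine the rest with a second source (customers)."""
    names = {c["customer_id"]: c["name"] for c in customers}
    table = []
    for order in orders:
        if not is_valid(order):
            continue  # a real pipeline would log or quarantine rejected rows
        table.append({
            "order_id": order["order_id"],
            "customer_name": names.get(order["customer_id"], "unknown"),
            "amount": order["amount"],
        })
    return table


# Made-up sample data from two sources.
orders = [
    {"order_id": 1, "customer_id": "c1", "amount": 120.0},
    {"order_id": 2, "customer_id": None, "amount": 80.0},   # missing customer
    {"order_id": 3, "customer_id": "c2", "amount": -5.0},   # negative amount
]
customers = [
    {"customer_id": "c1", "name": "Alice"},
    {"customer_id": "c2", "name": "Bob"},
]

clean_table = build_analyst_table(orders, customers)
```

Only the first order survives validation here, and the analyst gets a readable customer name instead of an ID they would have to look up themselves.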

Hopefully, through this article, you have a better understanding of the unsung heroes of the data field.

Btw, Happy New Year!
