Not all job opportunities in data science are created equal. I’ve seen hundreds of job descriptions for data science positions that are radically dissimilar. This occurs for a variety of reasons, not the least of which is that many organizations are unaware of what a data scientist does or the benefits of successful data science application.
Certain businesses advertise for a data scientist when, in fact, they require a data analyst. Some believe that a data scientist is a data architect or data engineer, when a data scientist is neither. I was once hired by a business to create a database data model! My response (with due acknowledgement to Obi-Wan Kenobi): “These are not the data models you seek.” I do not profess to be a data architect.
Therefore, if you are interviewing, you will need a keen eye for genuine data science opportunities as opposed to openings that amount to, “someone told us we needed a data scientist, so we slapped together this description.” Here are four questions to ask during phone or in-person interviews.
Question 1: Why do you believe you require a data scientist?
Although this is not the most critical question, it should be the first. Whether speaking with the CEO, CFO, CTO, or hiring manager, you must truly understand why they believe they require a data scientist. While it is likely that they require some of the skills that a data scientist brings to the table, such as data manipulation and visualization, they frequently conflate or confound various data skills into one colossal term: data scientist.
Typically, data scientists do not design relational database applications or data warehouses. Additionally, data scientists are not software developers in the broadest sense. A data scientist, on the other hand, will be required to query data contained within data warehouses, data repositories, or data silos. Frequently, they will also require data staging from those various silos in order to conduct analysis.
Therefore, if the hiring team requires someone to design and architect a data strategy, a data scientist is unlikely to be required, at least not yet. If they are looking for someone to build a relational database application, they are not looking for a data scientist, although some data scientists may have that skill set. If an organization is looking for someone who can take data from multiple data sources, stage it, and create an associative data model for display and analysis within a business intelligence application, they are not looking for a data scientist either. A business intelligence developer possesses this set of skills. And, while many data scientists possess this skill set too, it is not the core set of skills and functions that a data scientist requires or will use to perform their work.
Question 2: What business problems do you want the data scientist to address?
This is a critical question because it relates to the first. Chances are, the organization is well aware of its business challenges, but is unsure of how a data scientist can assist. Ascertain that they can articulate clearly how these difficulties relate to data science. Most likely, they require only a subset of the skills that a data scientist possesses, not the “complete package.” They may have been informed by someone that they require the services of a data scientist. Alternatively, they may have read it in a magazine. They should, however, be able to articulate why they require a data scientist rather than, say, a data analyst.
I’ve been hired to create visualizations, which is not a bad thing in and of itself—I am capable of creating visualizations. I am capable of using R, Qlik Sense, Tableau, and even Excel, but so is a data analyst, a business analyst, or a recent college graduate. And they do not require the compensation I do.
If an organization does not intend to perform inferential, predictive, or prescriptive analytics in the future, then it does not require a data scientist.
Question 3: Are you in possession of a data warehouse?
Numerous organizations lack a firm grasp on their data, and that data immaturity can work against a data scientist. They may be receiving a flood of data from a variety of different sources, all of which is stored in various data silos, in hundreds or thousands of Excel spreadsheets, and across an uncountable number of MS Access databases. A portion of the data may be stored on premises in a relational database or in a cloud-based application. However, just because a business has a lot of data, even if it is big data, does not mean it has the data maturity necessary to support a data scientist’s work.
If they lack a data warehouse or at the very least a strategy for organizing all of this disparate data for consumption by the various organizational departments, a data scientist will have a difficult time.
Can you imagine telling a stakeholder that you are capable of performing the work but that it will take a year to collect and organize the data? Even before you’ve had a chance to analyze it? That is something no business would tolerate, and yet it is possible that it will take that long. At some point, a data scientist will require the assistance of a subject matter expert (SME), possibly multiple times.
I’ve been in a position where I was prohibited from communicating with SMEs. They were “overwhelmed.” I was unable to complete my work. Having data is not synonymous with having the correct data! I once waited over a year for a view enhancement to a 15-minute SQL query. The SMEs were overworked, and I was denied access to the view to make the necessary changes myself. My work slowed to a halt. That is not the type of environment in which you want to work.
Question 4: When do you expect to see results?
This is perhaps the most critical question. When do they anticipate receiving results? As with many other aspects of life, data science takes time. Even if all the stars align perfectly, a data scientist will still be required to perform significant data manipulation, analysis, and visualization. Following that, some work validation with stakeholders is necessary before returning to perform additional data manipulation, visualization, and analysis. This is a gradual process, not an instantaneous occurrence. Additionally, it is not glamorous. If you’re expecting data science to be glamorous, you’re mistaken.
Therefore, if the organization for which you wish to work expects results in 30, 60, or 90 days, it is very likely that the expectation is unreasonable and that you will be unable to live up to it. In other words, you will be walking into an environment that sets you up for failure.
Therefore, ask pertinent questions and insist on genuine answers. Take notes, and give them a nickel tour of how you think a data science project should look, comparing it to what they are telling you. Then, and only then, will you be able to identify the right culture for performing effective data science work.