The final stage of a data science (or machine learning, data mining, or predictive analytics) project is to use the results, which generally involves communicating them to one or more audiences. I will call this “data artistry”. (The term is not in common use, but it does have some precedent in specific contexts.)
Data artistry can take many forms. If the result is meant to help the data scientist and others better understand the data and what it means, then data artistry is often a data visualization of some sort. Traditionally this has meant charts and graphs, but many other kinds of visualization have become popular in different contexts. For example, tag clouds (also known as word clouds or wordles) have become quite popular for presenting textual data.
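To make the tag cloud idea concrete, here is a minimal sketch of the counting step behind one: tally word frequencies, then scale each word's font size by how often it appears. The function name `tag_cloud_sizes` and its parameters are hypothetical, chosen for illustration, and a real tag cloud would usually also drop common stop words and handle the layout.

```python
import re
from collections import Counter

def tag_cloud_sizes(text, top_n=5, min_size=10, max_size=40):
    """Map the top_n most frequent words to font sizes (hypothetical helper)."""
    # Lowercase the text and pull out runs of letters and apostrophes as words.
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(words).most_common(top_n)
    if not counts:
        return {}
    highest = counts[0][1]
    lowest = counts[-1][1]
    # Avoid dividing by zero when every word has the same count.
    span = (highest - lowest) or 1
    # Linearly scale each count into the [min_size, max_size] range.
    return {
        word: min_size + (count - lowest) * (max_size - min_size) / span
        for word, count in counts
    }

sizes = tag_cloud_sizes("data data data science science art", top_n=3)
for word, size in sizes.items():
    print(word, size)
```

The most frequent word gets the largest size and the least frequent of the selected words gets the smallest, which is the visual effect a tag cloud relies on.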
Data artistry can also mean presenting something to a user in a way that will hopefully prompt an action. For example, when Netflix or Amazon presents a recommended item, the goal is for the end user to choose that item. This form of artistry is more subtle, as it is often about the layout and wording used to present the information. Further, this kind of data artistry may draw on its own data science methods, such as A/B testing to see which presentation results in the most sales.
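As a rough illustration of the A/B testing mentioned above, the sketch below compares the conversion rates of two presentations using a two-proportion z-test. This is one common way to analyze such a test, not necessarily what any particular company uses; the function name and the visitor and sale counts are made up for the example.

```python
import math

def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Return (z, two-sided p-value) comparing conversion rates of A and B."""
    rate_a = conv_a / n_a
    rate_b = conv_b / n_b
    # Pooled rate under the null hypothesis that A and B convert equally.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if std_err == 0:
        raise ValueError("conversion rates are all 0 or all 1; test undefined")
    z = (rate_b - rate_a) / std_err
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

# Hypothetical counts: layout A made 100 sales from 1000 visitors,
# layout B made 150 sales from 1000 visitors.
z, p = two_proportion_z_test(100, 1000, 150, 1000)
print(f"z = {z:.2f}, p = {p:.4f}")
```

A small p-value suggests the difference between the two presentations is unlikely to be chance alone, though in practice the sample size and stopping rule should be decided before the test is run.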
In the end, all of these stages are often iterative: depending upon what is learned along the way, earlier stages may be revisited to improve the overall outcome. Also, the stages need not all be done by the same data scientist. A team may exist, where the Data Surfer is someone who doesn’t necessarily have a lot of computer science or math expertise, but knows the domain. The Data Wrangler may be someone paid just to do data cleanup, and may even be someone who has done data entry in the past. The Data Miner will be the person who has the computer science and statistical background. And the Data Artist may be someone from the marketing department, or someone with a creative or artistic background more than a computer programming background. So whether there is one person or several filling these roles, each of the roles presented here is important to the data science process.