Senior Lecturer in Governance of Advanced and Emerging Technologies
University of Derby
As our organisations move towards using advanced analytics and artificial intelligence / machine learning (AI/ML) in operational systems, there is a strong imperative to reconsider how we design, build and test such systems before implementation.
We have always been vaguely aware that our operational systems data are not particularly clean. The scale of the dirty-data problem becomes clear when we learn that roughly 80% of the budget of a typical analytics or AI/ML project is consumed by data curation and data cleansing.
Now that we are training and deploying learning systems that find patterns in their training data, we need to focus not just on the cleanliness of that data but also on its representativeness and on the biases it contains. It is also becoming apparent that even the choice of analytics and the construction of the learning systems depend on the biases and world view of the software creators.
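To make the representativeness concern concrete, here is a minimal, illustrative sketch (the function name and figures are hypothetical, not from any particular project) that compares the share of each group in a training sample against a reference population and flags under-represented groups:

```python
# Hypothetical sketch: flag groups whose share in a training sample
# falls short of their share in a reference population.
from collections import Counter

def representation_gaps(sample, reference_shares, tolerance=0.05):
    """Return groups whose observed share in `sample` is more than
    `tolerance` below the expected share in `reference_shares`."""
    counts = Counter(sample)
    total = len(sample)
    gaps = {}
    for group, expected in reference_shares.items():
        observed = counts.get(group, 0) / total
        if expected - observed > tolerance:
            gaps[group] = (observed, expected)
    return gaps

# Illustrative data: a training set heavily skewed towards group "A"
training_labels = ["A"] * 80 + ["B"] * 15 + ["C"] * 5
population = {"A": 0.50, "B": 0.30, "C": 0.20}
print(representation_gaps(training_labels, population))
# → {'B': (0.15, 0.3), 'C': (0.05, 0.2)}
```

A check like this is only a first step: it catches crude sampling skew, not the subtler biases embedded in feature choice or labelling, which is precisely why testing must extend beyond the data itself.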
We are no longer in the comparatively simple world, so familiar to us all, of designing and testing traditional algorithmic systems.
How should we design, build and test these new systems? What are the imperatives?