As you may have noticed, the development of machine learning and AI projects is largely confined to specific groups within an organization. People with specialized skills handle the development, testing, and operationalization of the components. This way of managing AI/ML projects does benefit the organization because of its specialized nature, but it also becomes a barrier when the organization wants to move quickly on to other innovative work.

These specialized teams have to juggle the operationalization of data science projects with the development of new models and the identification of new opportunities. When operational activity is assigned to a team outside development, it gives far more flexibility and streamlines the processes, which makes them sustainable over the long term. It might seem very difficult to separate the development and the operationalization of AI / machine learning models, but that very segregation offers a lot of benefit. In this blog I would like to give an overview of the high-level data science Run and Maintain activities that can be handed to a dedicated team for better management of data science operations.

  • Cleansing of input data: This is one of the critical components of any machine learning or data science project. The data has to be cleansed of missing values and duplicates, and outliers have to be identified and handled. Deciding how to cleanse the data usually takes domain expertise. However, offloading the cleansing activity from the project team to a team of developers helps, because the work is carried out with a set of commands that transform the data frame and identify “NA” values, or, on an Azure-based framework, through a set of menu-driven steps. Whatever the tool, extending this work to technical staff in a Run and Maintain (R&M) team frees up bandwidth in the organization, since it is one of the steps that takes up most of an expert's time.
  • Extract, Transform and Load (ETL) / Extract, Load and Transform (ELT) activities are used to load and transform the data. Using various SQL or database queries, data is identified, extracted, and transformed into a loadable format. Often the data is not transformed up front; instead it is extracted, loaded into the tools, and transformed there. If the analysis needs window functions, as is the case with the Greenplum database for example, it is easier to run that analysis outside and then examine the results. Handing the day-to-day running of such activities to an operational team helps organizations make better use of their expertise.
  • Model accuracy management – Models tend to degrade over time for one reason or another. Continuous analysis of model accuracy, and identification of the point at which a model needs retraining, calls for close monitoring. An operational team can manage these activities, leaving more time for the development of advanced models.
  • Exception and error handling – Models work on a set of predefined input parameters and input data. When there is a mismatch of data type or data format the model fails, and a good coder adds clear error-handling messages to help manage such issues. The day-to-day identification of these model exceptions can be managed by an operational team.
  • System maintenance and restarting of models – Models are usually deployed as a set of web services, and a running service can get into a failed state and need to be deployed again before it returns results. A dedicated operational team can monitor which services are active and running, and start them again in case of failure.
  • IoT device management – If the input data is fed in from IoT devices, regular monitoring of those devices and their accuracy helps in fixing communication issues, power consumption, and power or data backup requirements. These can be managed better once they are operationalized.
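
To make the cleansing activity a little more concrete, here is a minimal sketch in plain Python of the three steps mentioned above: dropping missing values, dropping duplicates, and flagging outliers. The field names (`sensor`, `reading`), the list of missing-value markers, and the cut-off of 5 median absolute deviations are assumptions made for illustration; a real project would take them from its own domain experts and would more likely use a data frame library.

```python
from statistics import median

# Assumed spellings of "missing"; adjust to whatever the source system emits.
MISSING_MARKERS = {"", "NA", "N/A", "null", "None"}


def cleanse(rows, value_key="reading", cutoff=5.0):
    """Drop missing values and exact duplicates, then flag outliers in one field."""
    seen = set()
    clean = []
    for row in rows:
        raw = row.get(value_key)
        # Treat absent keys and marker strings such as "NA" as missing.
        if raw is None or str(raw).strip() in MISSING_MARKERS:
            continue
        fingerprint = tuple(sorted((k, str(v)) for k, v in row.items()))
        if fingerprint in seen:  # exact duplicate of an earlier row
            continue
        seen.add(fingerprint)
        clean.append({**row, value_key: float(raw)})

    values = [r[value_key] for r in clean]
    if not values:
        return clean
    med = median(values)
    # Median absolute deviation: a measure of spread that outliers barely move.
    mad = median(abs(v - med) for v in values)
    for r in clean:
        r["outlier"] = mad > 0 and abs(r[value_key] - med) / mad > cutoff
    return clean
```

The function only flags outliers and leaves the decision to drop or correct them to whoever owns the data, which is the kind of rule that can be written into an operating procedure.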
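
The model accuracy activity can be sketched in the same spirit. The class below keeps the most recent labelled predictions and reports when accuracy over that window falls below a threshold. The class name `AccuracyMonitor`, the window size, and the threshold are hypothetical; what counts as the "critical point" for retraining is a decision for each project.

```python
from collections import deque


class AccuracyMonitor:
    """Track accuracy over the most recent labelled predictions."""

    def __init__(self, window=100, threshold=0.8):
        # True where the prediction matched the label; old entries fall off.
        self.outcomes = deque(maxlen=window)
        self.threshold = threshold

    def record(self, predicted, actual):
        self.outcomes.append(predicted == actual)

    def accuracy(self):
        if not self.outcomes:
            return None  # nothing scored yet
        return sum(self.outcomes) / len(self.outcomes)

    def needs_retraining(self):
        # Judge only once the window is full, so a few early misses
        # do not raise an alert.
        full = len(self.outcomes) == self.outcomes.maxlen
        return full and self.accuracy() < self.threshold
```

An operational team could call `record` whenever the true outcome for a prediction becomes known, and raise a ticket with the development team when `needs_retraining()` turns true.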
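
Monitoring and restarting model services can also be reduced to a simple routine. The sketch below assumes the team already has some way to probe a service and some way to restart it; both are passed in as functions (`is_healthy` and `restart` are hypothetical names), so nothing here depends on a particular platform.

```python
def watch_services(services, is_healthy, restart, max_restarts=3):
    """Check each service once and restart unhealthy ones a limited number of times.

    Returns the names of services that are still down and need a person to look.
    """
    escalate = []
    for name in services:
        attempts = 0
        while not is_healthy(name) and attempts < max_restarts:
            restart(name)
            attempts += 1
        if not is_healthy(name):
            escalate.append(name)
    return escalate
```

Capping the number of restarts keeps the routine from looping forever on a service that is genuinely broken, and the returned list gives the operational team a clear hand-off point to the developers.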

Besides these, there are many other areas, such as Big Data platforms, that require regular maintenance of systems, with start/stop of services and a set of dedicated pages for maintenance. Dedicating the maintenance of such systems to an operational team helps organizations manage their talent better, by developing a set of standard operating procedures (SOPs) and other maintenance activities. I understand it would be difficult to transition these activities to an operational team immediately, but a phased approach will help organizations a lot.

Thanks – Jak

This Post Has One Comment

  1. Great article, Jai. I was working with a plant yesterday to get some basic input data for an AI project and found it needed to be formatted and cleaned before it made any sense. Also, there is a severe resource crunch in the AI industry, as very few designers are there to design the flow.
