Automatically import and process Excel files from emails into the ERP

Just-in-time data exchange across system boundaries is a decisive competitive factor for transport logistics companies. This is especially true for track & trace information: The more complex the supply relationships, the more flexible the transportation network, the more complex it is to guarantee complete track & trace data. The "last 10 to 20 percent" in particular is a major challenge, as it usually does not arrive via an interface, but finds its way into the ERP or TMS as an Excel or text file via email. Manual entries are often made in the department: copy & paste sends its regards. The digitization and automation of this tedious activity is usually not achieved. There is another way: modern cloud platforms work seamlessly with specialized SaaS services to convert the required information into bite-sized - or rather machine-readable - form. Using an IBM i core application as an example, let's take a look at how emails can be turned into fully-fledged T&T information using Excel sheets - without any voodoo magic.

 

 

Every day, various Excel lists with wagon numbers, container numbers, additional information and arrival times arrive by e-mail, which the operations staff grudgingly transfer to the IBM i-based transport management system - with copy & paste; once Excel, then green screen, then Excel again, and so on. At the same time, the majority of such arrival notifications are automatically imported from other sources via an EDIfact interface. The idea of somehow automatically reading out the unpopular but necessary e-mails and automatically importing the required information into the IBM i system via this existing EDIfact interface seems obvious. Surely this must be possible? In times of volatile and heavily overloaded logistics chains, this would be a welcome way to relieve the creative and productive employee; instead, he can take on challenging tasks that require his skills, e.g. forward-looking planning or even better customer service. The computer can also copy and paste itself - without getting bored or frustrated.

 

 

So far, so good. The tricky thing about this task is the structure of the Excel data and the proprietary Excel format itself. The sheets and data contained are structured reasonably homogeneously, i.e. in our case a maximum of four containers can be transported on each wagon, whose core data such as container number, order number, size and total weight are noted in columns. However, a closer look reveals a variability of spaces in the container numbers, different meanings of the order numbers in the "Remarks" column depending on the client, as well as extra lines and notes at the end from time to time, in which the dispatchers of another service provider have entered various information that is irrelevant to us. In addition, the Track & Trace status messages should only be sent when their actual time of arrival (ATA) is reached - even if they arrive early. Unfortunately, it is not known whether their unloading could also take place prematurely; the information is still missing. Last but not least, the Excel structure cannot be changed, as the Excel file has been sent to several transport companies in exactly the same way for years. We don't know whether a change would cause problems for other customers. In a nutshell:

Automation must take all eventualities into account

What exactly needs to be done now - and above all how? The following individual tasks arise:

 

  1. Be informed when a new, relevant (!) Excel file arrives by e-mail
  2. Convert Excel file (XLSX) into a processable format (CSV, JSON or similar)
  3. Select relevant Excel worksheets and discard others
  4. Clean T&T data of unwanted spaces and similar, i.e. normalize
  5. Convert T&T data into individual EDIfact data records for the IBM i (Fixed Record Format)
  6. Save EDIfact data records in individual text files (this is how the IBM i ERP wants it)
  7. Transfer text files to the IBM i ERP at the appropriate ATA time

 

This all sounds a little more complex: Excel is a proprietary format, the variability in the Excel workbooks and the data requires logical abstraction in order to select and standardize what is relevant. Old data formats such as EDIfact require the exact positioning of field contents in a character string. Finally, everything has to be timed correctly so that a customer does not receive an arrival notification before the container is ready for collection. That would be fatal.

Ad arma! To arms! - as the Romans would say. The "weapon of choice" is the cloud-based low-code automation and integration platform Workato. The platform from the Californian unicorn of the same name promises to solve almost all automation tasks in which data has to be exchanged and adapted between numerous systems. The individual work steps are interlinked in such a way that human interaction is no longer necessary where it is not explicitly required. We have talked about the basic principles of Workato elsewhere elsewhere.

The arrival of an email in the Office 365 mailbox - more precisely the Exchange Online mailbox of the Operations department - serves as the trigger for an automation run (job), provided it is a relevant email; this is ensured by checking the sender and the attachments. The recipe, as Workato calls its automation processes, can either be developed specifically for one case (as here) or branch out into modules and variants (e.g. using callable recipes or IF branches). The first individual task would thus be fulfilled.

 

 

 

The Excel workbook must then be converted into a processable format. Only the Excel worksheet that is relevant for sending the T&T information should be selected - in our case the current day or the last Saturday if a weekend has just ended. If the Excel workbook also contains worksheets with old data, as in our case, these must of course be ignored. Otherwise the recipients (customers) may end up with a duplicate mess.

For this purpose we use Cloudmersivea leading cloud service for a wide range of conversions. Its API specification states that it can separate XLSX workbooks and convert them to CSV. In turn, we can process CSV in Workato with minimal effort. Wonderful! That's exactly what we're looking for.

 

 

It would be tempting to immediately add an HTTP action to your recipe in Workato if it weren't for one little thing. Yes, the famous little thing was already known to Inspector Columbo and brought down so many (dis)honorable people. Here, this little thing is the requirement of the Cloudmersive API endpoint to receive the XLSX file as formData. This means multipart/form-data as MIME type, which the W3C explains as follows:

| The content type "multipart/form-data" should be used for submitting forms that contain files, non-ASCII data, and binary data.

So far, so good. However, if you want to set this content type in the Workato HTTP action, you will not find what you are looking for. Workato, however, offers an elegant solution via its Connector SDK. As a result, we create a connector, in this case for the Cloudmersive API, in which we can specify triggers, actions, configurable fields, etc. In our case, we implement an action to convert XLSX workbooks to CSV.

 

 

The Connector SDK is a purely web-based development environment for individual connectors; no extra software needs to be installed. A connector follows a standard structure including documented sample code and offers (almost) all the freedoms of Ruby, the preferred programming language in the SDK. JavaScript friends can alternatively integrate their code via a JavaScript action, but would then miss out on the advantages of the Connector SDK, which enables the creation of configurable (!) triggers, actions, connections, webhooks, configuration fields etc. in a connector. Since Ruby is intuitive to learn and powerful at the same time, we like to stick with the Connector SDK. It is also the gateway to the Connector Community Library, a kind of app store for connectors, through which you can (but don't have to) easily share your own connectors.

Using a Ruby method, we can easily transfer the XLSX to Cloudmersive in the Connector SDK and immediately receive several links to download all converted worksheets in CSV format. Individual task 2 would thus be completed.

 

 

Now we need to identify the current CSV list with the ATA data and make it machine-readable; in Workato, this means parsing the CSV file and making it available as a JSON object for all further steps. Ruby helps us here again, this time with a formula for populating three variables with data from the current CSV sheet. All other CSV sheets remain unconsidered, just as individual task 3 requires.

 

 

After cleaning up unwanted spaces while we create a separate T&T data record for each container and cache it in an internal list, all individual processing tasks including number 4 are completed for the time being.

 

 

Now it gets a little tricky again: Workato carries the processed data internally as Ruby objects. Sending these to another service as JSON would be easy, but is not sufficient in this case. The IBM i-based ERP requires the data records in a very specific EDIfact format, which also deviates slightly from the standard. We overcome this challenge in the 5th individual task by using Ruby string interpolations to convert the content of the object to be saved into the desired fixed record format.

 

 

Finally, we store each EDIfact record as a separate text file in an Amazon Web Services S3 bucket in the data protection region of our trust, from where a second Workato Recipe retrieves the generated EDIfact files to import them into the IBM i core system and provide customers with the required track & trace information at the right time. Et voilà: Individual task 6 completed and 7 prepared. We will discuss the seventh step in a separate article, in which we will deal with individual triggers, cron jobs and trigger conditions.

 

What have we learned? This specific example shows us three features of modern automation and workflow integration:

  1. Automation is an interplay of several distributed services and applications (here Workato, Office 365, Exchange Online, Cloudmersive, AWS, IBM); there is no "one platform" that does everything independently. The tasks are too diverse for that.
  2. Low-code can be wonderfully complemented by high-code, where custom-fit algorithms or specialized libraries are needed to master even complex requirements in heterogeneous environments.
  3. The Connector SDK offers every interested "maker" the opportunity to make regularly recurring actions accessible to every Workato user and to make their own offer available to their customers via the marketplace. It also opens up the wide world of Ruby and its functionalities.

There are no limits to your own creativity. Coupled with the underlying standard capabilities in terms of security, scalability, performance, maintainability, versioning, CI/CD and job control, nobody has to wait any longer to implement automation and integration tasks in all facets. Workato is available as a cloud-based SaaS "at the touch of a button" and can be used immediately - whether in state-of-the-art microservice architectures, in classic host systems or as a supplement to the existing Enterprise Service Bus. The charm is that Workato connects all services and worlds without being dogmatic. And that's what matters: Get your job done!

About Business Automatica GmbH:

Business Automatica reduces process costs by automating manual activities, increases the quality of data exchange in complex system architectures and connects on-premise systems with modern cloud and SaaS architectures.