How we collect data [Part 2]

The sixth installment of Milton's Q&A, this time diving deeper into data collection through our Milton Argos Collection Engine (MACe).

In a previous Milton Q&A we talked about how we collect data and introduced you to the MACeBox, our Milton Argos Collection Engine system. The MACe really is the heart of the collection process. We also gave many examples of the log types we collect, but we left it all at a very high level. Today, let’s dive a little deeper into MACe and describe what is possible.

The biggest takeaway, in case you are already tired of reading (TL;DR), is that MACe is extremely flexible and can be adapted to collect nearly any type of log from nearly any type of system. We draw the line at things like pointing a webcam at a screen across the room and running OCR on the image, but that’s about it. That said, let's describe a few common collection options to get you thinking about how we can gather your data.

Our most common and preferred method of capturing data is via syslog. What is syslog? Great question! Syslog is a standard network-based protocol that lets the devices on your network send text logs to a central collection point. Most network appliances and many software solutions are already capable of submitting logs via a syslog connection across the network. The MACeBox listens for syslog messages on both TCP and UDP, captures the data, and also tracks where messages are coming from. This is why it’s our favorite way of collecting data: an extremely reliable one-stop shop of sorts.
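To make the idea concrete, here is a minimal Python sketch of a UDP syslog listener that records where each message came from. It is an illustration only, not how the MACeBox is actually built. The names `decode_pri`, `SyslogUDPHandler`, `make_listener`, and `sources`, along with the port 5514, are assumptions made up for this example.

```python
import socketserver
from collections import Counter

# Illustrative only: number of messages seen per sending IP address.
sources = Counter()


def decode_pri(message):
    """Split the leading <PRI> value of a syslog message into
    (facility, severity). Returns None if no valid PRI is present."""
    if not message.startswith("<"):
        return None
    end = message.find(">")
    digits = message[1:end] if end != -1 else ""
    if not (digits.isascii() and digits.isdigit()):
        return None
    pri = int(digits)
    # PRI is defined as facility * 8 + severity.
    return pri // 8, pri % 8


class SyslogUDPHandler(socketserver.BaseRequestHandler):
    """Handles one UDP datagram per call."""

    def handle(self):
        # For UDP servers, self.request is a (data, socket) pair.
        data, _sock = self.request
        text = data.decode("utf-8", errors="replace").strip()
        sender_ip = self.client_address[0]
        sources[sender_ip] += 1
        print(sender_ip, decode_pri(text), text)


def make_listener(host="127.0.0.1", port=5514):
    """Create (but do not start) a UDP listener.
    The port is an arbitrary unprivileged choice for this sketch."""
    return socketserver.UDPServer((host, port), SyslogUDPHandler)
```

Calling `make_listener().serve_forever()` would start receiving messages. A TCP listener would follow the same pattern with `socketserver.TCPServer` and a handler that reads lines from the stream.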

Now, for systems that do not natively support syslog (think Active Directory or IIS), an agent can be used to collect logs and transmit them via syslog. NXLog is one of our most commonly used solutions. But what if your corporate policies prevent the use of an agent on the domain controller (DC)? In that case, we have a few other options. Remember when I said that MACe systems are flexible? We can use PowerShell or WMI to query the logs remotely, and in fact we can script many different collection methods, such as an HTTP listener for custom applications. We can also map an SMB share and use an agent on the MACe system to scan a folder for logs and transmit changes to files as they come in. Really, the sky's the limit... or a webcam and OCR.
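As a rough illustration of the folder-scanning approach, the Python sketch below remembers how far it has read in each file and returns only the lines added since the previous scan. It is a simplified assumption of how such an agent could work, not the actual MACe agent. The function name `read_new_lines` and the `offsets` dictionary are made up for this example, and forwarding the lines onward (for instance via syslog) is left out.

```python
import os


def read_new_lines(folder, offsets):
    """Return (filename, line) pairs appended to files in `folder`
    since the last call.

    `offsets` maps file path -> number of bytes already consumed,
    and is updated in place so the next scan resumes where this one ended.
    """
    new_lines = []
    for name in sorted(os.listdir(folder)):
        path = os.path.join(folder, name)
        if not os.path.isfile(path):
            continue
        start = offsets.get(path, 0)
        if os.path.getsize(path) < start:
            # The file shrank (truncated or rotated), so start over.
            start = 0
        with open(path, "rb") as handle:
            handle.seek(start)
            chunk = handle.read()
        # Consume complete lines only; a partial last line waits
        # until the writer finishes it.
        end = chunk.rfind(b"\n") + 1
        for raw in chunk[:end].splitlines():
            new_lines.append((name, raw.decode("utf-8", errors="replace")))
        offsets[path] = start + end
    return new_lines
```

Running this in a loop with a short sleep between scans, against a mapped share, gives a basic "transmit changes as they come in" behavior. Tracking byte offsets rather than line counts keeps each scan cheap even for large log files.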

So far we’ve mostly discussed on-premises sources, but what about cloud? Cloud is just as easy. In fact, when collecting data from the cloud, we can completely bypass the MACeBox at your location and pull the data directly to MACeHome. MACeHome is the ingestion point to our private cloud, Argos. When we need to collect from platforms like Azure, AWS, GCP, CrowdStrike, or Carbon Black, we set up direct ingestion, which helps lighten the load on your internet bandwidth.

OK, so for those of us who are more visual learners, here is a quick diagram of how the data flows from your systems (through syslog) to MACeBox, then to MACeHome, and ultimately to our Threat Hunters, who are keeping watch over your system.

On-prem collection looks something like this:
onprem_flow.png

And then we add in the cloud component…

Screen Shot 2021-04-30 at 12.57.25 PM.png

And when you put it all together, the end result looks like this!

onprem_flow2.png

And there you have it. As you can see, we really do collect a ton of data from your systems so that we can spot any outliers that we need to immediately investigate. 

If you have any additional questions about anything we covered today, feel free to drop me an email. This is absolutely my passion and I’m more than happy to help explain further or walk you through what setting up this collection process would look like. 

Until next time!