Knowledge Required: Familiarity with Sentinel

Tools required: Microsoft Sentinel

When an attacker gains a foothold in your network, which hopefully they never do, they commonly try to see what they have access to straight away. One technique is credential access, where the attacker brute-forces a system to find a valid username and password combination. This behavior is typically noisy, and many security platforms have detections for it, but a burst of failed authentications doesn’t always mean a brute-force attack is under way; it can also be caused by misconfiguration. To reduce alert fatigue, what an analyst really wants to know is whether an attacker has gained access to a system as a result of a brute-force attack. Today, we’re going to explore how to detect that behavior from SSH logs in Microsoft Sentinel.

Coming up with query logic

In this example, our detection logic is driven by what we need to know to make an informed decision about whether a brute-force attack has succeeded:

  • We want to look back over a certain time period
  • We want to look for failed authentication attempts by user and computer
  • We then want to evaluate whether users who have failed to authenticate have also successfully authenticated on the same computer
  • It’s important to present the information in a way that shows when a compromised user account is being used across multiple devices (lateral movement)

If you’ve already played with KQL, you’re probably already thinking that the summarize operator is perfect for this. If you’re more familiar with SQL, you’d use GROUP BY in a similar way. Microsoft’s documentation gives some good examples of how to use summarize, and it may be worth reading if you aren’t familiar with the operator.
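As a quick illustration of summarize, here is a minimal sketch that counts rows per computer and user. The datatable contents (web01, db01, alice, bob) are made-up sample values, not data from the environment used in this post.

```kusto
// Hypothetical sample rows standing in for login events
datatable(Computer:string, Username:string)
[
    "web01", "alice",
    "web01", "alice",
    "web01", "bob",
    "db01",  "alice"
]
// One output row per (Computer, Username) pair, with the number of input rows in each group
| summarize Total = count() by Computer, Username
```

Each distinct Computer and Username pair becomes a single row, which is the same shape the GROUP BY clause would give you in SQL.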

Getting a list of successful SSH logins by user and computer

In this example, we have visibility of SSH activity through the Syslog table in Sentinel. To get successful logins, we can run the following KQL query:

let lookback=48h;
Syslog
| where TimeGenerated > ago(lookback)
| where Facility has "auth" and SyslogMessage contains "Accepted"
| extend Method = tostring(split(SyslogMessage, " ")[1]) // authentication method, e.g. "password"
| extend Username = tostring(split(SyslogMessage, " ")[3]) //["Accepted","password","for","ns","from","192.168.0.5","port","53416","ssh2"]
| summarize TotalSuccess=count() by Computer, Username

Note that we’ve used split to extract the authentication method and the username from the log message and placed them in columns called Method and Username.
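To see which index holds which value, the sketch below splits a single sample message (the same one shown in the query comment) without touching the Syslog table. The Parts column name is only used here for illustration.

```kusto
// Split one sample sshd message on spaces and pick out the fields by position
print SyslogMessage = "Accepted password for ns from 192.168.0.5 port 53416 ssh2"
| extend Parts = split(SyslogMessage, " ")
// Parts[1] is the authentication method, Parts[3] is the username
| extend Method = tostring(Parts[1]), Username = tostring(Parts[3])
// Method == "password", Username == "ns"
```

Because the extraction is positional, it relies on the message keeping this exact word order.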

We can then generate a total number of successful logins for each user and the computer they signed into. That presents us with a clean table that can be seen below:

Computer Username TotalSuccess
nessus01-v2 megauser 64
ansible01 megauser 64
nessus01-v2 rickastley 32

Getting a list of failed SSH logins by user and computer

Using the same approach, we run a similar query but tweak the filter to generate a count of failed logins by computer and user. We can do that with the following query:

let lookback=48h;
Syslog
| where TimeGenerated > ago(lookback)
| where Facility has "auth" and SyslogMessage contains "Failed"
| extend Username = split(SyslogMessage, "user")[1] // Failed password for invalid user ns from 192.168.0.5 port 38546 ssh2
| extend Username = tostring(split(Username, " ")[1])
| summarize TotalFail=count() by Computer, Username

This presents the following table:

Computer Username TotalFail
ansible01 rickastley 12
nessus01-v2 rickastley 4
nessus01-v2 ${jndi 2
ansible01 ${jndi 2
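One caveat: the two-step split keys on the literal text "user", which sshd includes when the account does not exist ("invalid user"). Failures for accounts that do exist are typically logged as "Failed password for <name> from ...", with no "user" token to split on. The sketch below is one possible alternative, not the extraction used in this post: it uses extract with a regular expression that treats "invalid user " as optional. The second sample message is an assumed example of the valid-account format.

```kusto
// Two sample failure messages: one for an unknown account, one for an existing account
datatable(SyslogMessage:string)
[
    "Failed password for invalid user ns from 192.168.0.5 port 38546 ssh2",
    "Failed password for rickastley from 192.168.0.5 port 38546 ssh2"
]
// Capture the word between "for " (optionally followed by "invalid user ") and " from"
| extend Username = extract(@"for (?:invalid user )?(\S+) from", 1, SyslogMessage)
// Username is "ns" for the first row and "rickastley" for the second
```

If you adopt something like this, it is worth checking it against your own sshd messages before relying on it in a detection.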

Combining our results together

The union operator can be used to combine the results of two queries into one table.
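Here is a minimal sketch of that behavior using two small made-up tables. The names Successes and Failures and the sample values are hypothetical. With kind=outer, the result contains every column from either input, and cells that an input does not supply are left empty.

```kusto
// Hypothetical per-user success and failure counts
let Successes = datatable(Computer:string, Username:string, TotalSuccess:long)
[
    "web01", "alice", 5
];
let Failures = datatable(Computer:string, Username:string, TotalFail:long)
[
    "web01", "alice", 3,
    "web01", "bob",   7
];
Successes
// Rows from Failures have no TotalSuccess value, and rows from Successes have no TotalFail value
| union kind=outer Failures
```

The result has three rows and four columns: Computer, Username, TotalSuccess and TotalFail.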

Let’s apply this:

let lookback=48h;
let SuccessfulLoginsThreshold = 2;
let FailedLoginsThreshold = 2;
//
// Get successful logins from SSH
//
Syslog
| where TimeGenerated > ago(lookback)
| where Facility has "auth" and SyslogMessage contains "Accepted"
| extend Method = tostring(split(SyslogMessage, " ")[1]) // authentication method, e.g. "password"
| extend Username = tostring(split(SyslogMessage, " ")[3]) //["Accepted","password","for","ns","from","192.168.0.5","port","53416","ssh2"]
| summarize TotalSuccess=count() by Computer, Username
//
// Join our table of failed SSH logins
//
| union withsource=SourceTable kind=outer (
    Syslog
    | where TimeGenerated > ago(lookback)
    | where Facility has "auth" and SyslogMessage contains "Failed"
    | extend Username = split(SyslogMessage, "user")[1] // Failed password for invalid user ns from 192.168.0.5 port 38546 ssh2
    | extend Username = tostring(split(Username, " ")[1])
    | summarize TotalFail=count() by Computer, Username
)

Table:

SourceTable  Computer     Username    TotalSuccess  TotalFail
union_arg1   ansible01    rickastley                12
union_arg1   nessus01-v2  rickastley                4
union_arg1   nessus01-v2  ${jndi                    2
union_arg1   ansible01    ${jndi                    2
union_arg0   nessus01-v2  megauser    64
union_arg0   ansible01    megauser    64
union_arg0   nessus01-v2  rickastley  32

Now we’re pretty much there with the data we need for our logic.

Note: at the top of the query, we’ve added two variables that set thresholds for the number of failed and successful logins to look for. This allows us to tune our KQL query and change the logic quickly to reduce false positives. The variable names are self-explanatory.

It’s also worth highlighting the use of the extend operator, which consistently places the username from both failed and successful logins in a column named Username. This consistency matters later, when we summarize by Username, and it also makes our data easier to read.

Including our thresholds in the query

With all the data in one table, we now need to sum the number of failures and the number of successes by user and computer. Again, this is a perfect use for the summarize operator. While we’re at it, we’ll also add a where clause that applies the thresholds defined at the top of our query. This can be done by adding two lines that filter our latest table of results:

| summarize SuccessfulWithinTimePeriod=sum(TotalSuccess), FailedWithinTimePeriod=sum(TotalFail) by Username, Computer
| where SuccessfulWithinTimePeriod > SuccessfulLoginsThreshold and FailedWithinTimePeriod > FailedLoginsThreshold

Using the data from the table above, sum() totals the failed and successful logins for each user and computer, and the where operator then keeps only the rows that exceed both tuning thresholds. This presents a clear list of users for whom a brute-force attack may have succeeded, and on which computers:

Username Computer SuccessfulWithinTimePeriod FailedWithinTimePeriod
rickastley nessus01-v2 32 4

It’s about time somebody investigates what that ‘rickastley’ user is up to, as it doesn’t look like he’s giving up any time soon!

Extending this further

The final summarize makes this query easy to enrich with other sources, so it can detect brute-force attempts elsewhere and give greater visibility. We simply need to union further sub-queries, one for each table we want to search, into our query. The only requirements are that each sub-query is summarized and produces the following columns:

  • TotalFail / TotalSuccess (depending on if we’re searching for successful or failed authentication)
  • A column called Username
  • The Computer column

If you want a challenge, look at how you could extend this to Windows authentication logging by using the SecurityEvent table and adding a query that searches for event IDs 4624 (successful logon) and 4625 (failed logon).
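One possible starting point for that challenge is sketched below. It is not a tested detection: it assumes the SecurityEvent table is populated in your workspace and that its TargetUserName column holds the account name you want to compare with the SSH usernames. You may also want to filter on LogonType to exclude logon types you don’t care about.

```kusto
let lookback = 48h;
SecurityEvent
| where TimeGenerated > ago(lookback)
// 4624 = successful logon, 4625 = failed logon
| where EventID in (4624, 4625)
// Assumption: TargetUserName is the account that attempted to log on
| extend Username = TargetUserName
// Produce the same column names the SSH sub-queries use
| summarize TotalSuccess = countif(EventID == 4624), TotalFail = countif(EventID == 4625) by Computer, Username
```

Wrapped in parentheses and added with a further union ahead of the final summarize, this would feed Windows logons into the same threshold check, since it produces the Computer, Username, TotalSuccess and TotalFail columns.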

Once you’ve got your query nailed and you’ve tuned out false positives, you can set up an analytics rule.

Finished Query:

let lookback=48h;
let SuccessfulLoginsThreshold = 2;
let FailedLoginsThreshold = 2;
//
// Get successful logins from SSH
//
Syslog
| where TimeGenerated > ago(lookback)
| where Facility has "auth" and SyslogMessage contains "Accepted"
| extend Method = tostring(split(SyslogMessage, " ")[1]) // authentication method, e.g. "password"
| extend Username = tostring(split(SyslogMessage, " ")[3]) //["Accepted","password","for","ns","from","192.168.0.5","port","53416","ssh2"]
| summarize TotalSuccess=count() by Computer, Username
//
// Join our table of failed SSH logins
//
| union withsource=SourceTable kind=outer (
    Syslog
    | where TimeGenerated > ago(lookback)
    | where Facility has "auth" and SyslogMessage contains "Failed"
    | extend Username = split(SyslogMessage, "user")[1] // Failed password for invalid user ns from 192.168.0.5 port 38546 ssh2
    | extend Username = tostring(split(Username, " ")[1])
    | summarize TotalFail=count() by Computer, Username
)
| summarize SuccessfulWithinTimePeriod=sum(TotalSuccess), FailedWithinTimePeriod=sum(TotalFail) by Username, Computer
| where SuccessfulWithinTimePeriod > SuccessfulLoginsThreshold and FailedWithinTimePeriod > FailedLoginsThreshold
