Releases: microsoft/msticpy

Multi-dimensional plots for outliers

06 Dec 21:42
337632d

Highlights

Multi-dimensional plots for outliers by @Tatsuya-hasegawa

The outliers module has lived in MSTICPy for a long time but has been somewhat neglected.
@Tatsuya-hasegawa (hacker-T) has contributed some cool visualizations to
better interpret the data.
Many thanks!!!

import numpy as np
from msticpy.analysis.outliers import identify_outliers, plot_outlier_results

n_dimension = 7

# Create random numeric samples
data = np.random.rand(100, n_dimension)

# Find outliers using the Isolation Forest algorithm
clf, X_outliers, y_pred_outliers = identify_outliers(
    data, data, contamination=0.1, max_features=0.4
)

feature_columns = [f"feature{i}" for i in range(1, n_dimension + 1)]

plot_outlier_results(
    clf,
    data,
    data,
    X_outliers,
    feature_columns=feature_columns,
    plt_title="MSTICPY Isolation Forest Anomaly Detection for Multi Dimension Features",
)

Improved code/docs for federated authentication for M365D/M365 Graph providers - @ryan-detect-dot-dev

Although using federated auth (rather than a client secret) has been possible for a while, the documentation
for how to use this was not in the MSTICPy docs. Thanks to Ryan we now have this, along with cleaned-up code
for the Defender* data providers.
(Although Ryan is credited as a new contributor in this release, he has made several previous contributions under
a different GitHub identity.)

Rigorous Type Annotation work started by @FlorianBracq earlier this year continues.

This helps to make the code more robust and clearer to read and use. This is thankless work but my
huge thanks go out to @FlorianBracq for this!

Other fixes

Some other important fixes to the CyberReason driver and the Azure Monitor/MS Sentinel driver are also included.

Full Changelog: v2.14.0...v2.15.0

User Session Management, MaxMind GeoLite fix, Extract nested dicts from Pandas

21 Oct 19:03
deab7a5

User Session Configuration

Do you always have one or more data providers or other components that you need to load for every notebook you create?
I do, and got a bit fed up with typing the same lines of code over and over again.

User session configuration lets you specify which providers are loaded, whether or not to connect and which parameters
to supply at load and connect time. You put all of this into a straightforward YAML file and load it using the following:

import msticpy as mp   # you likely will already be doing this
mp.init_notebook()     # and this

mp.load_user_session("my_config.yaml")   # if you have a "mp_user_session.yaml" in the current directory
                                         # you can skip the parameter

This example shows the structure of the YAML:

QueryProviders:
  qry_prov_sent:
    DataEnvironment: MSSentinel
    InitArgs:
      debug: True
    Connect: True
    ConnectArgs:
      workspace: MySoc
      auth_methods: ['cli', 'device_code']
  qry_prov_md:
    DataEnvironment: M365D
Components:
  mssentinel:
    Module: msticpy.context.azure
    Class: MicrosoftSentinel
    InitArgs:
    Connect: True
    ConnectArgs:
      workspace: MySoc
      auth_methods: ['cli', 'device_code']

The providers/components created (e.g. qry_prov_sent in this example)
are published back to your notebook Python namespace, so you'll see
these available as variables ready to use.

This configuration file is equivalent to the following code:

qry_prov_sent = mp.QueryProvider("MSSentinel", debug=True)
qry_prov_sent.connect(workspace="MySoc", auth_methods=['cli', 'device_code'])
qry_prov_md = mp.QueryProvider("M365D")

from msticpy.context.azure import MicrosoftSentinel
mssentinel = MicrosoftSentinel()
mssentinel.connect(workspace="MySoc", auth_methods=['cli', 'device_code'])
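
The step where entries are "published back" to the notebook namespace can be pictured with a small stand-alone sketch. Everything here is hypothetical: DummyProvider and load_session are illustrative names, not msticpy APIs, and the config is written as a Python dict rather than read from YAML. It only shows the idea that each entry name becomes a variable name.

```python
# Hypothetical sketch: DummyProvider and load_session are illustrative
# stand-ins, not part of the msticpy API.
class DummyProvider:
    """Stand-in for a query provider; records how it was created/connected."""

    def __init__(self, environment, **init_args):
        self.environment = environment
        self.init_args = init_args
        self.connected = False
        self.connect_args = {}

    def connect(self, **connect_args):
        self.connected = True
        self.connect_args = connect_args


def load_session(config, namespace):
    """Create one provider per config entry and publish it into `namespace`."""
    for name, settings in config.get("QueryProviders", {}).items():
        provider = DummyProvider(
            settings["DataEnvironment"], **(settings.get("InitArgs") or {})
        )
        if settings.get("Connect"):
            provider.connect(**(settings.get("ConnectArgs") or {}))
        namespace[name] = provider  # the entry name becomes the variable name


session_config = {
    "QueryProviders": {
        "qry_prov_sent": {
            "DataEnvironment": "MSSentinel",
            "InitArgs": {"debug": True},
            "Connect": True,
            "ConnectArgs": {
                "workspace": "MySoc",
                "auth_methods": ["cli", "device_code"],
            },
        },
        "qry_prov_md": {"DataEnvironment": "M365D"},
    }
}

notebook_namespace = {}
load_session(session_config, notebook_namespace)
print(sorted(notebook_namespace))
```

In a real notebook the namespace would be the notebook's own globals, which is why the providers appear as ready-to-use variables.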

Not a huge saving, on the face of it, but if you create a lot of notebooks or want to use
msticpy in an automation scenario, it can be very helpful.
Pass verbose=True to load_user_session to see more detailed logging of what is going on.
See the MSTICPy documentation for full details.

Maxmind GeoIPLite fix

Sometime recently (not too sure when) Maxmind changed their download procedure to use
a different URL and authentication mechanism. This was causing auto-update to fail. To use
the new mechanism you need to get your Maxmind User Account ID (log in and look at your
account properties) and add that to your msticpyconfig.yaml as shown below.

OtherProviders:
  GeoIPLite:
    Args:
      AccountID: "1234567"
      AuthKey:
        EnvironmentVar: "MAXMIND_AUTH"
      DBFolder: "~/.msticpy"
    Provider: "GeoLiteLookup"
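
The AuthKey entry above is a reference to an environment variable rather than a literal value. As a rough illustration of that pattern (not msticpy's actual settings code), the sketch below resolves a value that is either plain text or an {"EnvironmentVar": name} reference. The resolve_setting helper and the placeholder key value are assumptions made for this example.

```python
import os


def resolve_setting(value, environ=os.environ):
    """Return a literal value, or look it up for {"EnvironmentVar": name}."""
    if isinstance(value, dict) and "EnvironmentVar" in value:
        var_name = value["EnvironmentVar"]
        if var_name not in environ:
            raise KeyError(f"Environment variable {var_name} is not set")
        return environ[var_name]
    return value


# Mirrors the Args section of the YAML above.
geoip_args = {
    "AccountID": "1234567",
    "AuthKey": {"EnvironmentVar": "MAXMIND_AUTH"},
    "DBFolder": "~/.msticpy",
}

# A fake environment with a placeholder key, so the sketch runs anywhere.
fake_environ = {"MAXMIND_AUTH": "example-license-key"}

resolved = {
    key: resolve_setting(val, fake_environ) for key, val in geoip_args.items()
}
print(resolved)
```

Keeping the key in an environment variable means the config file itself never contains the secret.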

Extract nested dictionaries from pandas column to multiple rows/columns

@pioneerHitesh has added this as a new method in the mp_pivot pandas extension:

data_df.mp_pivot.dict_to_dataframe(col="my_nested_column")

It returns a dataframe with the column recursively expanded:

  • lists become new rows
  • dictionaries become new columns

So a column with the following structure:

   NCol
0  {'A': ['A1', 'A2', 'A3'], 'B': {'B1': 'B1-1', 'B2': 'B2-1'}}
1  {'A': ['A3', 'A4', 'A5'], 'B': {'B3': 'B3-1', 'B4': 'B4-1'}}

my_df = src_df.mp_pivot.dict_to_dataframe(col="NCol")
my_df

Would be unpacked to:

   A.0  A.1  A.2  B.B1  B.B2  B.B3  B.B4
0  A1   A2   A3   B1-1  B2-1  nan   nan
1  A3   A4   A5   nan   nan   B3-1  B4-1
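
To make the column naming in that output easier to follow, here is a minimal pure-Python sketch of recursive flattening for a single row. It is not the library's implementation; the flatten helper is a hypothetical name, and it only reproduces the dotted column names shown above (list positions and dictionary keys joined with ".").

```python
def flatten(value, prefix=""):
    """Recursively flatten nested dicts/lists into {dotted_key: leaf_value}."""
    if isinstance(value, dict):
        items = value.items()
    elif isinstance(value, list):
        items = enumerate(value)
    else:
        return {prefix: value}
    flat = {}
    for key, child in items:
        child_prefix = f"{prefix}.{key}" if prefix else str(key)
        flat.update(flatten(child, child_prefix))
    return flat


# First row of the NCol example above.
row = {"A": ["A1", "A2", "A3"], "B": {"B1": "B1-1", "B2": "B2-1"}}
flat_row = flatten(row)
print(flat_row)
```

Rows that lack a given key (B.B3 and B.B4 for row 0) have no entry in their flattened dict, which corresponds to the nan cells in the output table.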

What's Changed

  • Authentication module unit test by @ianhelle in #800
  • Use sessions config and GeoIP download failure by @ianhelle in #801
  • Added Inbuilt function to extract nested JSON by @pioneerHitesh in #798
  • Add max retry parameter to the execution prevent HTTP 429 by @vx3r in #802


Full Changelog: v2.13.1...v2.14.0

Hotfix for authentication error

19 Sep 18:26
c01b629

We introduced a bug in azure_auth_core that caused Azure authentication to fail.

Full Changelog: v2.13.0...v2.13.1

AI documentation assistant, BinaryEdge TI provider and other misc fixes

13 Sep 00:31
b5e052a

We've been quietly doing some work to introduce LLM/GPT/AI capabilities into msticpy.
@EileenG02 has helped us in that direction by building a document Q&A agent using Autogen.

You can try it out in a notebook using the following:

Load the magic extension

%load_ext msticpy.aiagents.mp_docs_rag_magic

Ask a question in a separate cell using the %%ask cell magic

%%ask
What are the three things that I need to connect to Azure Query Provider?

Awesome work @EileenG02!

There's also a new TI provider for BinaryEdge courtesy of @petebryan.

Alongside this there have been quite a few contributions to fix and improve things like:

  • Splunk improvements (thanks @Tatsuya-hasegawa)
  • Fixes for Sentinel provider get_alert_rules to use updated API (thanks @BWC-TomW)
  • A massive amount of type annotation work and fixes to context/TI providers by @FlorianBracq
  • Miscellaneous fixes to things like the Sentinel TI provider, an MSSentinel tidy-up to handle parameters more consistently,
    and use of the term CountryOrRegionName in place of CountryName in geolocation contexts.

The gory details of the PRs follow:

Full Changelog: v2.12.0...v2.13.0

Splunk and Sentinel Updates

10 May 23:25
07a2f0d

Sentinel updates

WorkspaceConfig and the Sentinel QueryProvider (azure_monitor_driver) have had a few updates:

  • WorkspaceConfig now handles both old (Kqlmagic) and standard connection string formats.
  • A lot of legacy code has been removed from WorkspaceConfig.
  • The MSSentinel QueryProvider accepts additional authentication parameters at connect time (e.g. you can now supply "client_id" and "client_secret" to query_provider.connect()).
  • msticpyconfig.yaml now supports using an "MSSentinel" key in place of "AzureSentinel".
  • Workspace entries in msticpyconfig.yaml support an Args subkey, where you can add authentication parameters. These are supplied to the connect() method unless overridden in the connect() call. As with Args sections for other providers, the values can be text or references to environment variables or Azure Key Vault secrets.
  • Fix to the MSSentinel API update_incident method to add full properties.
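
The precedence described for the Args subkey (config values are used unless the caller overrides them) can be sketched as a simple dictionary merge. This is an illustration only: effective_connect_args is a hypothetical helper, and the placeholder values are made up.

```python
def effective_connect_args(config_args, **call_kwargs):
    """Combine Args from config with keyword arguments given at call time."""
    merged = dict(config_args)   # start from the values in the config file
    merged.update(call_kwargs)   # values passed by the caller take precedence
    return merged


# Placeholder values standing in for a workspace's Args section.
workspace_args = {
    "client_id": "id-from-config",
    "client_secret": "secret-from-config",
}

merged = effective_connect_args(workspace_args, client_id="id-from-call")
print(merged)
```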

Splunk Updates

  • Added jwt authentication token expiry check.
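
For readers unfamiliar with what a token expiry check involves, the following is a generic sketch using only the standard library. It is not the Splunk driver's code: jwt_expiry and is_expired are hypothetical helpers, the sample token is unsigned and built purely for the demonstration, and no signature verification is performed.

```python
import base64
import json
import time


def jwt_expiry(token):
    """Return the `exp` claim (seconds since the epoch) from a JWT, or None."""
    payload_b64 = token.split(".")[1]
    payload_b64 += "=" * (-len(payload_b64) % 4)  # restore stripped padding
    payload = json.loads(base64.urlsafe_b64decode(payload_b64))
    return payload.get("exp")


def is_expired(token, now=None):
    """True if the token has an `exp` claim that is not in the future."""
    exp = jwt_expiry(token)
    current = time.time() if now is None else now
    return exp is not None and exp <= current


def _b64(obj):
    """Encode a dict as an unpadded base64url JSON segment."""
    raw = json.dumps(obj).encode()
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode()


# Unsigned demo token: header.payload.(empty signature)
sample_token = ".".join([_b64({"alg": "none"}), _b64({"exp": 1_700_000_000}), ""])
print(jwt_expiry(sample_token))
```

Checking expiry before connecting lets a client report a clear "token expired" message instead of a generic authentication failure.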

Other fixes

Fix for vtlookup3.py

  • Fixed problematic way of using nest_asyncio - this was causing failures when run from a langchain agent.

Fix for lookup/tilookup

  • If the progress parameter was not passed it would still try to cancel a non-existent progress task and cause an exception.

QueryProviders

  • Fix split query time-ranges calculation - thanks to @pjain90 for spotting this.
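
As background on what a split-query time-range calculation has to get right, here is a generic sketch that divides a time span into contiguous, non-overlapping sub-ranges. It does not reproduce msticpy's actual splitting logic (which the notes above do not describe); split_time_range is a hypothetical helper.

```python
from datetime import datetime


def split_time_range(start, end, chunks):
    """Split [start, end) into `chunks` contiguous, non-overlapping ranges."""
    if chunks < 1 or end <= start:
        raise ValueError("need chunks >= 1 and end > start")
    step = (end - start) / chunks
    # Use `end` itself as the final bound so no time is lost to rounding.
    bounds = [start + step * i for i in range(chunks)] + [end]
    return list(zip(bounds[:-1], bounds[1:]))


ranges = split_time_range(datetime(2024, 1, 1), datetime(2024, 1, 2), 4)
for range_start, range_end in ranges:
    print(range_start, "->", range_end)
```

The properties worth checking in any such calculation are that the first range starts at the query start, the last range ends at the query end, and each range begins exactly where the previous one stops.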

What's Changed

  • Set up CI with 1ES Azure Pipelines by @ianhelle in #763
  • Update ws_config to handle kqlmagic connection strings by @ianhelle in #767
  • Fix split query time-ranges calculation by @ianhelle in #762
  • Add support for ruff and u/p devcontainer by @ianhelle in #765
  • Add jwt auth token expire check and modify some messages when connecting Splunk by @Tatsuya-hasegawa in #770
  • WSConfig updates by @ianhelle in #771
  • Pass true for props into _build_sent_data when calling update_incident by @kylelol in #774
  • Changing cert thumbprint from Sha1 to Sha256 in Az Kusto driver by @ianhelle in #775

Full Changelog: v2.11.0...v2.12.0

Sentinel Split Query fix

25 Mar 23:58
b3daf13

This is a minor release, mainly to add a warning for Kusto/Sentinel queries that return partial results.
A close friend of MSTICPy (thx @Cyb3r-Monk) spotted that MSTICPy did not report partial results when doing split queries, so it was possible to silently lose data from the query range.

Due to an unfortunate admin error, the fix for this was committed directly to main, so no PR for this is available. :-(

If you want the query to fail (throw an exception) rather than just warn you can supply a new parameter fail_if_partial.
This only affects the Sentinel query provider and works for standard as well as split queries.

NOTE: the documentation has a typo and calls this fail_on_commit - we'll fix that in the next release to support both fail_if_partial and fail_on_partial

Example

qry_prov.exec_query(query_string, fail_if_partial=True)
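
The difference between warning and failing can be sketched as follows. This is an illustration of the behavior described above, not the Sentinel driver's code: check_partial and PartialResultsError are hypothetical names, and the is_partial flag stands in for whatever the query backend reports.

```python
import warnings


class PartialResultsError(Exception):
    """Hypothetical exception type used only in this sketch."""


def check_partial(rows, is_partial, fail_if_partial=False):
    """Warn (default) or raise when a query returned only partial results."""
    if is_partial:
        message = "Query returned partial results; some data may be missing."
        if fail_if_partial:
            raise PartialResultsError(message)
        warnings.warn(message)
    return rows


# Default behavior: the rows come back, accompanied by a warning.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    rows = check_partial([1, 2], is_partial=True)
print(rows, len(caught))
```

Raising is the safer choice in automation, where nobody is watching for warnings in notebook output.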

Full Changelog: v2.10.0...v2.11.0

v2.10.0

12 Feb 19:00
d1c0912

Full Changelog: v2.9.0...v2.10.0

Defender Advanced hunting, IPQualityScore TI provider

15 Nov 22:08
74eeb2e

Some of the highlights of this release:

IPQualityScore

New TI provider submitted by @petebryan - provides a lot of interesting stats on IPs.

Defender Advanced Hunting API

Thanks to @d3vzer0 our MS Defender client is now able to use the supported Graph-based API rather than the legacy
APIs. To use this, for the moment use the DataEnvironment name M365DGraph when you create a
query provider. In the next 0.x release we will switch the other aliases (M365D, MDE, MDATP) to use this
new interface and deprecate the existing ones.

Startup errors when running in unexpected environments

init_notebook made some (incorrect) assumptions about when it would be running in a Synapse environment.
Azure Machine Learning have recently changed their default compute to be a Synapse environment.
Fixes here will correct failures due to faulty detection of environment type.

Startup fixes and perf improvements

We've optimized some of the imports done within the package at startup so msticpy should be quicker to
load.

Azure env credentials fix

Although we previously supported the Azure EnvironmentCredential credential type, our implementation allowed
you to use it only with ClientID + ClientSecret. The changes allow it to be used with the other supported
credential formats - notably username + password and certificate authentication using a certificate file.
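
As a reminder of what these formats look like in practice, the sketch below inspects a set of environment variables and reports which EnvironmentCredential format they describe. The variable names are the ones documented for the azure-identity EnvironmentCredential; the detect_env_credential helper itself is hypothetical and is not part of msticpy or azure-identity.

```python
def detect_env_credential(environ):
    """Report which EnvironmentCredential format the variables describe."""

    def has(*names):
        return all(environ.get(name) for name in names)

    if has("AZURE_TENANT_ID", "AZURE_CLIENT_ID", "AZURE_CLIENT_SECRET"):
        return "client_secret"
    if has("AZURE_TENANT_ID", "AZURE_CLIENT_ID", "AZURE_CLIENT_CERTIFICATE_PATH"):
        return "certificate"
    if has("AZURE_CLIENT_ID", "AZURE_USERNAME", "AZURE_PASSWORD"):
        return "username_password"
    return None


# Placeholder values; a real check would pass os.environ instead.
cert_env = {
    "AZURE_TENANT_ID": "my-tenant",
    "AZURE_CLIENT_ID": "my-app-id",
    "AZURE_CLIENT_CERTIFICATE_PATH": "/path/to/cert.pem",
}
print(detect_env_credential(cert_env))
```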

Improvements to Entities

Although these are not visible to most people, we try to keep our Entity definitions in sync with the official
Microsoft "V3" entity definitions. We've added a few entity types and updated some of the attributes
to bring this in line, while still allowing backwards compatible attributes to be used.

New Contributors

  • @2xyo made their first contribution in #723

Full Changelog: v2.8.0...v2.9.0

Stability release

04 Oct 21:05
8314145

A few bugs had crept in over the last couple of releases: some due to buggy coding, some due to the world moving forward. So, many items in this release address these.

Among the feature improvements are the following:

  • Documentation and scripts from @ccianelli22 for creating a MSTICPy install for use in isolated (no Internet) environments. This is super useful for customers operating in sovereign clouds or other air-gapped high-security environments.
  • Added Splunk authentication method using security token rather than username/password - thanks @Tatsuya-hasegawa
  • Query yaml file validation by @FlorianBracq
  • Paging for large CyberReason queries by @FlorianBracq
  • Modern method to obtain cloud-specific URL endpoints for Azure services. Previously, we were relying on msrestazure, which is now deprecated for this purpose. Many thanks to @ccianelli22 for the work to do this.
  • Fix (by me) for a bug I'd introduced with the switch to using Azure-monitor-query library for MS Sentinel. When using a connection string with this new driver, the logic failed to parse and extract details from this correctly. Many thanks to @cindraw for reporting this bug.

Full Changelog: v2.7.0...v2.8.0

2.8.0 pre-release

27 Sep 01:04
0d1b18e

Updated method to dynamically fetch Azure endpoints (rather than relying on deprecated msrestazure).
Updated version of Insight data provider