System mapper

A utility that retrieves infrastructure and system information from a cloud provider (Azure for now), stores it in a graph database (Neo4j) and visualizes it using Dash.

Requirements

Python Libraries and CLI utilities

Python

CLI

Data persistence and API usage (database and IIS Administration API)

  • Neo4j >= 3.4 and supported by neomodel (tested using kernel version 3.4.0):
    • The compatible APOC plugin needs to be installed too (tested using 3.4.0.8). Note: some queries use APOC and need extra Neo4j configuration. Add the following lines at the end of your neo4j.conf file:
    #***********************************************************
    # APOC
    #***********************************************************
    dbms.security.procedures.unrestricted=apoc.*
    apoc.export.file.enabled=true
    apoc.import.file.use_neo4j_config=false
  • Install the IIS Administration API in the accessible VMs. More info about the IIS API. You should grant a user read access to the API. For that, the appsettings.json of the IIS Administration API needs to be changed (CORS settings, file access settings and security settings).

Setup

  • Clone the repository.
  • Install the required Python packages referenced above (pip install -r requirements.txt).
  • Check that a Neo4j instance is running on the default port, has a user and password set (for example, user -> neo4j and pass -> ne@4j) and is properly configured (APOC plugin) as stated above.
  • Configure the relevant options in the config.json file (you can specify the path to the file using a .env file with the env var CONFIG_FILE_PATH, or by adding a config.json file to the system_mapper/ dir). For example, a config that uses Azure and the IIS Administration API to get the information:
{
    "initial_rule": "RULE_0_MULTIPLE_RESOURCE_GROUPS",
    "rules": [
        "RULE_0_MULTIPLE_RESOURCE_GROUPS",
        "RULE_1_MULTIPLE_SUSCRIPTIONS",
        "RULE_2_ORPHAN_NODES",
        "RULE_3_MAX_DEPENDENCIES"],
    "rules_mapping": {
        "RULE_0_MULTIPLE_RESOURCE_GROUPS": [
            "MATCH (n)-[r]-(m) MATCH (n)-[rg1:ELEMENT_RESOURCE_GROUP]-(nrg1) MATCH (m)-[rg2:ELEMENT_RESOURCE_GROUP]-(nrg2) WHERE NOT nrg1 = nrg2 RETURN n, r, m ",
            "n,r,m"
            ],
        "RULE_1_MULTIPLE_SUSCRIPTIONS": [
            "MATCH (n)-[r]-(m) MATCH (n)-[]-(np:Property {key: 'subscriptionId'}) MATCH (m)-[]-(mp:Property {key: 'subscriptionId'}) WHERE NOT np.value = mp.value RETURN n, r, m, np, mp ",
            "n,r,m,np,mp"
            ],
        "RULE_2_ORPHAN_NODES": [
            "MATCH (n) WHERE NOT (n)-[]-() RETURN n",
            "n"
            ],
        "RULE_3_MAX_DEPENDENCIES": [
            "MATCH (n)-[]-(m) RETURN n, COLLECT(m) as others ORDER BY SIZE(others) DESC LIMIT 1",
            "n"
            ]
        },
    "neo4j_database_url": "bolt://neo4j:ne@4j@localhost:7687",
    "database_strings": ["database", "base de datos", "MicrosoftSQLServer"],
    "port": 55539,
    "app_container_url": "/api/webserver/websites/",
    "app_container_token": "some token to access the IIS API. More info: https://docs.microsoft.com/en-us/IIS-Administration/management-portal/connecting",
    "app_container_user": "<windows username>",
    "app_container_password": "<windows user password>",
    "visualization_port": "80",
    "visualization_n_threads": "100",
    "visualization_dev": false,
    "visualization_host": "0.0.0.0"
}
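
As a rough illustration of the lookup order described above, the following minimal sketch resolves the config path from CONFIG_FILE_PATH and falls back to system_mapper/config.json. The function names and the fallback constant are hypothetical, not taken from the project, and the sketch assumes the .env file has already been loaded into the process environment.

```python
import json
import os
from pathlib import Path

# Hypothetical fallback, based on the "system_mapper/ dir" mentioned above.
DEFAULT_CONFIG_PATH = Path("system_mapper") / "config.json"


def resolve_config_path(env=None):
    """Return the path from CONFIG_FILE_PATH if set, else the fallback path."""
    env = os.environ if env is None else env
    return Path(env.get("CONFIG_FILE_PATH") or DEFAULT_CONFIG_PATH)


def load_config(env=None):
    """Read and parse the JSON config file found by resolve_config_path."""
    path = resolve_config_path(env)
    with path.open(encoding="utf-8") as handle:
        return json.load(handle)
```

Calling load_config() with no arguments reads the real process environment; passing a dict makes the lookup easy to exercise without touching it.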

Some notes regarding the config file:

  • initial_rule is the name of the initial rule to apply when checking the rule visualization.

  • rules is the list of available rules (each name needs to match a key in the rules_mapping dict).

  • rules_mapping holds the custom Cypher queries. Each entry has (1) the query and (2) a comma-separated string with the variables returned by the query.

  • The other values relate to:

    • neo4j_database_url: Connection string to connect to the Neo4j database
    • database_strings: Strings to find in the VM information to classify a Virtual Machine as a Database node
    • IIS related config:
      • port: Connection port for the IIS management API (the API needs to be enabled in the Virtual Machines)
      • app_container_url: Relative URL from which to start retrieving the information. For now THE ONLY SUPPORTED ONE is /api/webserver/websites/.
      • app_container_token: Token used to authenticate the requests made to the IIS Administration API or relevant API. See IIS Administration access tokens
      • app_container_user: Windows user used to authenticate via NTLM in the case of the IIS Administration API
      • app_container_password: Windows user password used to authenticate via NTLM in the case of the IIS Administration API
    • Visualization dashboard related config:
      • visualization_port: Port for the server to launch the dash app.
      • visualization_n_threads: Number of threads the server in prod mode will use.
      • visualization_dev: Whether the Dash app is launched in dev mode (run from Dash) or in prod mode (waitress).
      • visualization_host: Host for the server.
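
The relationship between initial_rule, rules and rules_mapping described in these notes can be checked before launching anything. The sketch below is only an illustration under that reading of the notes: check_rules is a hypothetical helper, not part of system_mapper, and the RETURN-clause check is a simple token match rather than a real Cypher parser.

```python
import re


def check_rules(config):
    """Return a list of human-readable problems found in the rule settings."""
    problems = []
    rules = config.get("rules", [])
    mapping = config.get("rules_mapping", {})

    # initial_rule must name one of the available rules.
    if config.get("initial_rule") not in rules:
        problems.append("initial_rule is not listed in rules")

    for name in rules:
        entry = mapping.get(name)
        # Each rule needs a two-item entry: the query and its variables.
        if not isinstance(entry, list) or len(entry) != 2:
            problems.append(f"{name}: expected a [query, variables] entry in rules_mapping")
            continue
        query, variables = entry
        # Collect the word tokens that follow the last RETURN keyword.
        tail = re.split(r"\bRETURN\b", query, flags=re.IGNORECASE)[-1]
        returned = set(re.findall(r"\w+", tail))
        for variable in variables.split(","):
            if variable.strip() not in returned:
                problems.append(f"{name}: '{variable.strip()}' is not in the RETURN clause")
    return problems
```

An empty list means the three settings are consistent with each other; it says nothing about whether the queries are valid Cypher.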

Run

After setting up the environment, run from the root directory:

python -m system_mapper.main