Module: extractor.py
- Purpose:
This module provides abstract syntax tree node visitation and attribute extraction functionality.
- Platform:
Linux/Windows | Python 3.10+
- Developer:
J Berendt
- Email:
- Comments:
n/a
- Example:
Example code use:
>>> from badsnakes.libs.parser import Parser
>>> from badsnakes.libs.extractor import Extractor
>>> p = Parser()
>>> e = Extractor()
>>> p.parse(path='hello.py')
>>> e.extract(node=p.ast_)
>>> # Display the extracted nodes.
>>> e.display()
- class badsnakes.libs.extractor.Extractor[source]
Bases: NodeVisitor
Inspect, extract and store relevant AST node attributes.
- display(name: str = None)[source]
Display the extracted contents.
The extracted attributes for each of the following AST nodes are displayed here:
ast.Assign
ast.Attribute
ast.Call
ast.Constant
ast.FunctionDef
ast.Import
ast.ImportFrom
- Parameters:
name (str, optional) – Name of the Python module being displayed. Defaults to None.
- extract(node: Module)[source]
Extract and store relevant attributes from a parsed AST.
This method is an alias for ast.NodeVisitor.visit(), which is called directly after the docstrings have been extracted.
- Parameters:
node (ast.Module) – Starting node to be visited from which attributes are to be extracted.
- visit_Assign(node: Assign)[source]
Extract attributes of interest from ast.Assign nodes.
Generally, the assignments are used by the analyser to detect (very) long strings, or suspicious module or function aliasing.
For example:
[A very very long string which may be base64 encoded code]
A URL including ‘http’
cexe = exec
lave = eval
_i = __import__
- Parameters:
node (ast.Assign) – A node of type ast.Assign.
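The aliasing case can be illustrated with a minimal sketch. It is not the badsnakes implementation: the class name AssignSketch, the assigns attribute and the set of watched builtins are assumptions made for illustration only.

```python
import ast


class AssignSketch(ast.NodeVisitor):
    """Hypothetical visitor which records assignments as text pairs."""

    def __init__(self):
        self.assigns = []  # (target, value) pairs, both as source text

    def visit_Assign(self, node):
        value = ast.unparse(node.value)
        # A single statement may have several targets, e.g. a = b = exec
        for target in node.targets:
            self.assigns.append((ast.unparse(target), value))
        self.generic_visit(node)


source = "cexe = exec\nlave = eval\n_i = __import__\n"
visitor = AssignSketch()
visitor.visit(ast.parse(source))

# Assumed watch-list; a real analyser would read this from its config.
watched = {"exec", "eval", "__import__"}
aliases = [(t, v) for t, v in visitor.assigns if v in watched]
```

Here, aliases holds one pair for each of the three disguised names, which an analyser could then report.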
- visit_Attribute(node: Attribute)[source]
Extract attributes of interest from ast.Attribute nodes.
For example:
__builtins__.__getattribute__
ctypes.windll
os.system
- Parameters:
node (ast.Attribute) – A node of type ast.Attribute.
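One way to recover dotted names such as os.system from ast.Attribute nodes is sketched below. The helper dotted_name and the class AttributeSketch are hypothetical names, and how the Extractor itself stores these values is not shown in this documentation.

```python
import ast


def dotted_name(node):
    """Return 'a.b.c' for a chain of Attribute/Name nodes, else None."""
    parts = []
    while isinstance(node, ast.Attribute):
        parts.append(node.attr)
        node = node.value
    if isinstance(node, ast.Name):
        parts.append(node.id)
        return ".".join(reversed(parts))
    return None


class AttributeSketch(ast.NodeVisitor):
    """Hypothetical visitor which records fully dotted attribute names."""

    def __init__(self):
        self.names = []

    def visit_Attribute(self, node):
        name = dotted_name(node)
        if name is not None:
            self.names.append(name)
        else:
            # The chain is rooted in something other than a plain name
            # (e.g. a call), so keep descending to find inner attributes.
            self.generic_visit(node)


source = "os.system('dir')\nlib = ctypes.windll\n"
visitor = AttributeSketch()
visitor.visit(ast.parse(source))
```

After the visit, visitor.names contains the two dotted names found in the source string.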
- visit_Call(node: Call)[source]
Extract attributes of interest from ast.Call nodes.
Generally, function calls are used by the analyser to detect calls to functions which are generally considered unsafe, or used for suspicious activity.
Additionally, any arguments passed into these function calls are stored in the _args class attribute, to be later added to the Module.arguments object.
For example:
Calls to compile, exec or eval
Disguised imports using __import__
Calls to requests.post
- Parameters:
node (ast.Call) – A node of type ast.Call.
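A minimal sketch of recording call names and their literal arguments follows. The class CallSketch and its calls and args attributes are assumptions for illustration; they only loosely mirror the _args attribute described above and are not the library's code.

```python
import ast


class CallSketch(ast.NodeVisitor):
    """Hypothetical visitor which records call names and literal args."""

    def __init__(self):
        self.calls = []  # called function names, as source text
        self.args = []   # literal positional arguments passed to calls

    def visit_Call(self, node):
        self.calls.append(ast.unparse(node.func))
        for arg in node.args:
            if isinstance(arg, ast.Constant):
                self.args.append(arg.value)
        # Descend so that nested calls, e.g. exec(compile(...)), are seen.
        self.generic_visit(node)


source = "m = __import__('os')\nexec(compile('1+1', '<s>', 'exec'))\n"
visitor = CallSketch()
visitor.visit(ast.parse(source))
```

Calling generic_visit() inside the handler matters here: without it, the compile call nested inside exec would never be recorded.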
- visit_Constant(node: Constant)[source]
Extract attributes of interest from ast.Constant nodes.
Generally, the constants of interest here are strings. The extracted strings will be compared against the blacklisted strings to determine if any suspicious activities are being attempted.
- Docstrings:
Often, a docstring containing benign text such as a semi-colon or the term ‘execute’ can flag a module as dangerous during a string search.
Because of this, the AST is walked to collect and store all docstrings when the extract() method is called. A constant node is only stored by this method for analysis if the constant’s value was not found in the stored docstrings. For further rationale on this, please refer to the _extract_docstrings() method.
For example:
Calls to cmd.exe or powershell
References to Bitcoin or other payment demands
Windows registry key paths
- Parameters:
node (ast.Constant) – A node of type ast.Constant.
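The docstring filter described above can be sketched as follows. The class ConstantSketch and its attributes are hypothetical, and only the module-level docstring is collected here to keep the example short.

```python
import ast


class ConstantSketch(ast.NodeVisitor):
    """Hypothetical visitor which keeps strings that are not docstrings."""

    def __init__(self, docstrings):
        self.docstrings = set(docstrings)
        self.strings = []

    def visit_Constant(self, node):
        # Only strings are of interest; skip any value known to be a docstring.
        if isinstance(node.value, str) and node.value not in self.docstrings:
            self.strings.append(node.value)


source = '"""Execute the task; then exit."""\ncmd = "powershell -enc AAAA"\n'
tree = ast.parse(source)
# clean=False keeps the raw text, so it compares equal to the node value.
docs = [ast.get_docstring(tree, clean=False)]
visitor = ConstantSketch(docs)
visitor.visit(tree)
```

The docstring contains both a semi-colon and the word 'Execute', yet only the assigned command string reaches visitor.strings.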
- visit_FunctionDef(node: FunctionDef)[source]
Extract attributes of interest from ast.FunctionDef nodes.
Generally, the analyser will use these nodes in search of obfuscated function names, indicating suspicious activity.
For example:
____0xb1_00OO00OO_01001001
- Parameters:
node (ast.FunctionDef) – A node of type ast.FunctionDef.
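Collecting function names for later inspection might look like the sketch below. The class FunctionDefSketch and the function names in the sample source are invented for illustration; the analyser's actual obfuscation checks are not shown in this documentation.

```python
import ast


class FunctionDefSketch(ast.NodeVisitor):
    """Hypothetical visitor which records every function name it meets."""

    def __init__(self):
        self.names = []

    def visit_FunctionDef(self, node):
        self.names.append(node.name)
        # Descend so that nested function definitions are recorded too.
        self.generic_visit(node)


source = "def O0O0O0():\n    def inner():\n        pass\n"
visitor = FunctionDefSketch()
visitor.visit(ast.parse(source))
```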
- visit_Import(node: Import)[source]
Extract attributes of interest from ast.Import nodes.
Generally, the analyser will use these nodes in search of module imports which may indicate suspicious activity.
For example:
import requests
import winreg
import ctypes as ct
import win32api as _win32api
import win32con as _win32con
- Parameters:
node (ast.Import) – A node of type ast.Import.
- visit_ImportFrom(node: ImportFrom)[source]
Extract attributes of interest from ast.ImportFrom nodes.
Generally, the analyser will use these nodes in search of module imports which may indicate suspicious activity.
For example:
from win32api import SetFileAttributes
from win32con import SRCAND, FILE_ATTRIBUTE_HIDDEN
from win32file import CreateFileW, WriteFile, CloseHandle
- Parameters:
node (ast.ImportFrom) – A node of type ast.ImportFrom.
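Both import forms expose their names through ast.alias objects, as the sketch below shows. The class ImportSketch and the tuple layout are assumptions for illustration. Parsing does not execute the imports, so Windows-only modules can be inspected on any platform.

```python
import ast


class ImportSketch(ast.NodeVisitor):
    """Hypothetical visitor which records imports as tuples."""

    def __init__(self):
        # (module, imported name or None, alias or None)
        self.imports = []

    def visit_Import(self, node):
        for alias in node.names:
            self.imports.append((alias.name, None, alias.asname))

    def visit_ImportFrom(self, node):
        for alias in node.names:
            self.imports.append((node.module, alias.name, alias.asname))


source = (
    "import ctypes as ct\n"
    "from win32con import SRCAND, FILE_ATTRIBUTE_HIDDEN\n"
)
visitor = ImportSketch()
visitor.visit(ast.parse(source))
```

Keeping the alias alongside the module name lets a later check notice renamed imports such as ctypes as ct.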
- _extract_docstrings(node: Module)[source]
Collect and store all docstrings in the module.
- Parameters:
node (ast.Module) – Top-level AST node to be searched.
The extracted (uncleaned) docstrings are stored into the _docs attribute. A constant is only tested if the value is not in the _docs attribute.
- Rationale:
Extracting and storing docstrings lets us put simple strings such as ';' and '()' in config.toml under the [analyser.constant.dangerous] and [analyser.constant.suspect] tables without having a false-positive trigger for the string being somewhere in the docstring.
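A minimal sketch of collecting uncleaned docstrings with the standard library is shown below. The function name collect_docstrings is hypothetical, and the sketch assumes docstrings are taken from module, class and function nodes; the method's actual traversal is not shown in this documentation.

```python
import ast

# Node types which ast.get_docstring() accepts.
_DOC_NODES = (ast.Module, ast.ClassDef, ast.FunctionDef, ast.AsyncFunctionDef)


def collect_docstrings(tree):
    """Return the raw docstrings of the module, its classes and functions."""
    docs = []
    for node in ast.walk(tree):
        if isinstance(node, _DOC_NODES):
            # clean=False returns the text exactly as stored in the AST.
            doc = ast.get_docstring(node, clean=False)
            if doc is not None:
                docs.append(doc)
    return docs


source = (
    '"""Module doc; run()."""\n'
    "\n"
    "def f():\n"
    '    """Execute f."""\n'
    '    return ";"\n'
)
docs = collect_docstrings(ast.parse(source))
```

The returned string ";" is an ordinary constant rather than a docstring, so it is absent from docs and would still be tested.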
- generic_visit(node)
Called if no explicit visitor function exists for a node.
- visit(node)
Visit a node.