plaso.engine package¶
Submodules¶
plaso.engine.artifact_filters module¶
Helper to create filters based on forensic artifact definitions.
-
class
plaso.engine.artifact_filters.
ArtifactDefinitionsFilterHelper
(artifacts_registry, artifact_filters, knowledge_base)[source]¶ Bases:
object
Helper to create filters based on artifact definitions.
Builds extraction filters from forensic artifact definitions.
For more information about Forensic Artifacts see: https://github.com/ForensicArtifacts/artifacts/blob/master/docs/Artifacts%20definition%20format%20and%20style%20guide.asciidoc
-
BuildFindSpecs
(environment_variables=None)[source]¶ Builds find specifications from artifact definitions.
The resulting find specifications are set in the knowledge base.
Parameters: environment_variables (Optional[list[EnvironmentVariableArtifact]]) – environment variables.
-
BuildFindSpecsFromFileArtifact
(source_path, path_separator, environment_variables, user_accounts)[source]¶ Builds find specifications from a file source type.
Parameters: - source_path (str) – file system path defined by the source.
- path_separator (str) – file system path segment separator.
- environment_variables (list[str]) – environment variable attributes used to dynamically populate environment variables in key.
- user_accounts (list[str]) – identified user accounts stored in the knowledge base.
Returns: find specifications for the file source type.
Return type: list[dfvfs.FindSpec]
-
BuildFindSpecsFromRegistryArtifact
(source_key_path)[source]¶ Builds find specifications from a Windows Registry source type.
Parameters: source_key_path (str) – Windows Registry key path defined by the source. Returns: find specifications for the Windows Registry source type.
Return type: list[dfwinreg.FindSpec]
-
static
CheckKeyCompatibility
(key_path)[source]¶ Checks if a Windows Registry key path is supported by dfWinReg.
Parameters: key_path (str) – path of the Windows Registry key. Returns: True if key is compatible or False if not. Return type: bool
-
KNOWLEDGE_BASE_VALUE
= 'ARTIFACT_FILTERS'¶
-
plaso.engine.configurations module¶
Processing configuration classes.
-
class
plaso.engine.configurations.
CredentialConfiguration
(credential_data=None, credential_type=None, path_spec=None)[source]¶ Bases:
plaso.containers.interface.AttributeContainer
Configuration settings for a credential.
-
credential_data
¶ bytes – credential data.
-
credential_type
¶ str – credential type.
-
path_spec
¶ dfvfs.PathSpec – path specification.
-
CONTAINER_TYPE
= 'credential_configuration'¶
-
-
class
plaso.engine.configurations.
EventExtractionConfiguration
[source]¶ Bases:
plaso.containers.interface.AttributeContainer
Configuration settings for event extraction.
These settings are primarily used by the parser mediator.
-
filter_object
¶ objectfilter.Filter – filter that specifies which events to include.
-
text_prepend
¶ str – text to prepend to every event.
-
CONTAINER_TYPE
= 'event_extraction_configuration'¶
-
-
class
plaso.engine.configurations.
ExtractionConfiguration
[source]¶ Bases:
plaso.containers.interface.AttributeContainer
Configuration settings for extraction.
These settings are primarily used by the extraction worker.
-
hasher_file_size_limit
¶ int – maximum file size that hashers should process, where 0 or None represents unlimited.
-
hasher_names_string
¶ str – comma separated string of names of hashers to use during processing.
-
process_archives
¶ bool – True if archive files should be scanned for file entries.
-
process_compressed_streams
¶ bool – True if file content in compressed streams should be processed.
-
yara_rules_string
¶ str – Yara rule definitions.
-
CONTAINER_TYPE
= 'extraction_configuration'¶
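The hasher settings above can be read as follows. This is a minimal sketch and not plaso code: the helper names `parse_hasher_names` and `within_size_limit` are hypothetical, and treating the size limit as inclusive is an assumption.

```python
def parse_hasher_names(hasher_names_string):
    """Splits a comma separated string of hasher names into a list."""
    if not hasher_names_string:
        return []
    # Ignore surrounding whitespace and empty segments.
    return [
        name.strip() for name in hasher_names_string.split(',')
        if name.strip()]


def within_size_limit(file_size, hasher_file_size_limit):
    """Determines if a file of file_size bytes should be hashed."""
    # Both 0 and None represent "unlimited".
    if not hasher_file_size_limit:
        return True
    # Assumption: a file exactly at the limit is still hashed.
    return file_size <= hasher_file_size_limit
```

For example, `parse_hasher_names('md5, sha256')` yields the two names without whitespace, and a limit of 0 or None lets every file through.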
-
-
class
plaso.engine.configurations.
InputSourceConfiguration
[source]¶ Bases:
plaso.containers.interface.AttributeContainer
Configuration settings of an input source.
-
mount_path
¶ str – path of a “mounted” directory input source.
-
CONTAINER_TYPE
= 'input_source'¶
-
-
class
plaso.engine.configurations.
ProcessingConfiguration
[source]¶ Bases:
plaso.containers.interface.AttributeContainer
Configuration settings for processing.
-
artifact_filters
¶ Optional[list[str]] – names of artifact definitions that are used for filtering file system and Windows Registry key paths.
-
credentials
¶ list[CredentialConfiguration] – credential configurations.
-
data_location
¶ str – path to the data files.
-
debug_output
¶ bool – True if debug output should be enabled.
-
event_extraction
¶ EventExtractionConfiguration – event extraction configuration.
-
extraction
¶ ExtractionConfiguration – extraction configuration.
-
filter_file
¶ str – path to a file with find specifications.
-
input_source
¶ InputSourceConfiguration – input source configuration.
-
log_filename
¶ str – name of the log file.
-
parser_filter_expression
¶ str – parser filter expression, where None represents all parsers and plugins.
-
preferred_year
¶ int – preferred initial year value for year-less date and time values.
-
profiling
¶ ProfilingConfiguration – profiling configuration.
-
temporary_directory
¶ str – path of the directory for temporary files.
-
CONTAINER_TYPE
= 'processing_configuration'¶
-
-
class
plaso.engine.configurations.
ProfilingConfiguration
[source]¶ Bases:
plaso.containers.interface.AttributeContainer
Configuration settings for profiling.
-
directory
¶ str – path to the directory where the profiling sample files should be stored.
-
profilers
¶ set(str) – names of the profilers to enable. Supported profilers are:
- ‘guppy’, which profiles memory usage using guppy;
- ‘memory’, which profiles memory usage;
- ‘parsers’, which profiles CPU time consumed by individual parsers;
- ‘processing’, which profiles CPU time consumed by different parts of processing;
- ‘serializers’, which profiles CPU time consumed by individual serializers;
- ‘storage’, which profiles storage reads and writes.
-
sample_rate
¶ int – the profiling sample rate. Contains the number of event sources processed.
-
CONTAINER_TYPE
= 'profiling_configuration'¶
-
HaveProfileMemory
()[source]¶ Determines if memory profiling is configured.
Returns: True if memory profiling is configured. Return type: bool
-
HaveProfileMemoryGuppy
()[source]¶ Determines if memory profiling with guppy is configured.
Returns: True if memory profiling with guppy is configured. Return type: bool
-
HaveProfileParsers
()[source]¶ Determines if parsers profiling is configured.
Returns: True if parsers profiling is configured. Return type: bool
-
HaveProfileProcessing
()[source]¶ Determines if processing profiling is configured.
Returns: True if processing profiling is configured. Return type: bool
-
HaveProfileSerializers
()[source]¶ Determines if serializers profiling is configured.
Returns: True if serializers profiling is configured. Return type: bool
-
HaveProfileStorage
()[source]¶ Determines if storage profiling is configured.
Returns: True if storage profiling is configured. Return type: bool
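The HaveProfile* methods above can be pictured as membership checks on the profilers set. This is a sketch under that assumption; the class name `ProfilingSettings` and the snake_case method names are hypothetical and this is not the plaso implementation.

```python
class ProfilingSettings(object):
    """Hypothetical stand-in for a profiling configuration."""

    def __init__(self, profilers=None):
        # Names of the enabled profilers, for example {'memory', 'storage'}.
        self.profilers = set(profilers or [])

    def have_profile_memory(self):
        """Determines if memory profiling is configured."""
        return 'memory' in self.profilers

    def have_profile_parsers(self):
        """Determines if parsers profiling is configured."""
        return 'parsers' in self.profilers

    def have_profile_storage(self):
        """Determines if storage profiling is configured."""
        return 'storage' in self.profilers
```

A caller would then enable only the profilers it needs, for example `ProfilingSettings(profilers=['memory', 'storage'])`.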
-
plaso.engine.engine module¶
The processing engine.
-
class
plaso.engine.engine.
BaseEngine
[source]¶ Bases:
object
Processing engine interface.
-
knowledge_base
¶ KnowledgeBase – knowledge base.
-
classmethod
BuildArtifactsRegistry
(artifact_definitions_path, custom_artifacts_path)[source]¶ Builds an artifact definitions registry from the artifact definitions files.
Parameters: - artifact_definitions_path (str) – path to artifact definitions file.
- custom_artifacts_path (str) – path to custom artifact definitions file.
Returns: artifact definitions registry.
Return type: artifacts.ArtifactDefinitionsRegistry
Raises: RuntimeError
– if no valid FindSpecs are built.
-
classmethod
BuildFilterFindSpecs
(artifact_definitions_path, custom_artifacts_path, knowledge_base_object, artifact_filter_names=None, filter_file_path=None)[source]¶ Builds find specifications from artifacts or filter file if available.
Parameters: - artifact_definitions_path (str) – path to artifact definitions file.
- custom_artifacts_path (str) – path to custom artifact definitions file.
- knowledge_base_object (KnowledgeBase) – knowledge base.
- artifact_filter_names (Optional[list[str]]) – names of artifact definitions that are used for filtering file system and Windows Registry key paths.
- filter_file_path (Optional[str]) – path of the filter file.
Returns: find specifications for the file source type.
Return type: list[dfvfs.FindSpec]
Raises: RuntimeError
– if no valid FindSpecs are built.
-
classmethod
CreateSession
(artifact_filter_names=None, command_line_arguments=None, debug_mode=False, filter_file_path=None, preferred_encoding='utf-8', preferred_time_zone=None, preferred_year=None)[source]¶ Creates a session attribute container.
Parameters: - artifact_filter_names (Optional[list[str]]) – names of artifact definitions that are used for filtering file system and Windows Registry key paths.
- command_line_arguments (Optional[str]) – the command line arguments.
- debug_mode (bool) – True if debug mode was enabled.
- filter_file_path (Optional[str]) – path to a file with find specifications.
- preferred_encoding (Optional[str]) – preferred encoding.
- preferred_time_zone (Optional[str]) – preferred time zone.
- preferred_year (Optional[int]) – preferred year.
Returns: session attribute container.
Return type:
-
GetSourceFileSystem
(source_path_spec, resolver_context=None)[source]¶ Retrieves the file system of the source.
Parameters: - source_path_spec (dfvfs.PathSpec) – path specification of the source to process.
- resolver_context (dfvfs.Context) – resolver context.
Returns: tuple containing:
- dfvfs.FileSystem: file system.
- path.PathSpec: mount point path specification. The mount point path specification refers to either a directory or a volume on a storage media device or image. It is needed by the dfVFS file system searcher (FileSystemSearcher) to indicate the base location of the file system.
Return type: tuple
Raises: RuntimeError
– if source file system path specification is not set.
-
PreprocessSources
(artifacts_registry_object, source_path_specs, resolver_context=None)[source]¶ Preprocesses the sources.
Parameters: - artifacts_registry_object (artifacts.ArtifactDefinitionsRegistry) – artifact definitions registry.
- source_path_specs (list[dfvfs.PathSpec]) – path specifications of the sources to process.
- resolver_context (Optional[dfvfs.Context]) – resolver context.
-
plaso.engine.extractors module¶
The extractor class definitions.
An extractor is a class used to extract information from “raw” data.
-
class
plaso.engine.extractors.
EventExtractor
(parser_filter_expression=None)[source]¶ Bases:
object
Event extractor.
An event extractor extracts events from event sources.
-
ParseDataStream
(parser_mediator, file_entry, data_stream_name)[source]¶ Parses a data stream of a file entry with the enabled parsers.
Parameters: - parser_mediator (ParserMediator) – parser mediator.
- file_entry (dfvfs.FileEntry) – file entry.
- data_stream_name (str) – data stream name.
Raises: RuntimeError
– if the file-like object or the parser object is missing.
-
ParseFileEntryMetadata
(parser_mediator, file_entry)[source]¶ Parses the file entry metadata e.g. file system data.
Parameters: - parser_mediator (ParserMediator) – parser mediator.
- file_entry (dfvfs.FileEntry) – file entry.
-
ParseMetadataFile
(parser_mediator, file_entry, data_stream_name)[source]¶ Parses a metadata file.
Parameters: - parser_mediator (ParserMediator) – parser mediator.
- file_entry (dfvfs.FileEntry) – file entry.
- data_stream_name (str) – data stream name.
-
-
class
plaso.engine.extractors.
PathSpecExtractor
(duplicate_file_check=False)[source]¶ Bases:
object
Path specification extractor.
A path specification extractor extracts path specification from a source directory, file or storage media device or image.
-
ExtractPathSpecs
(path_specs, find_specs=None, recurse_file_system=True, resolver_context=None)[source]¶ Extracts path specification from a specific source.
Parameters: - path_specs (Optional[list[dfvfs.PathSpec]]) – path specifications.
- find_specs (Optional[list[dfvfs.FindSpec]]) – find specifications.
- recurse_file_system (Optional[bool]) – True if extraction should recurse into a file system.
- resolver_context (Optional[dfvfs.Context]) – resolver context.
Yields: dfvfs.PathSpec – path specification of a file entry found in the source.
-
plaso.engine.filter_file module¶
Filter file.
-
class
plaso.engine.filter_file.
FilterFile
(path)[source]¶ Bases:
object
Filter file.
A filter file contains one or more path filters.
A path filter may contain path expansion attributes. Such an attribute is defined as anything within curly brackets, for example “System{my_attribute}PathKeyname”. If the attribute “my_attribute” is defined, the placeholder in the path filter is replaced with its runtime value, resulting in for example “SystemMyValuePathKeyname”.
If the path filter needs to have curly brackets in the path then these need to be escaped with another curly bracket, for example “System{my_attribute}{{123-AF25-E523}}KeyName”, where “{{123-AF25-E523}}” will be replaced with “{123-AF25-E523}” at runtime.
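The expansion and escaping rules described above match the behavior of Python's `str.format`. The sketch below illustrates them with the example strings from the text; the function name `expand_path_filter` is hypothetical and whether plaso uses `str.format` internally is an assumption.

```python
def expand_path_filter(path_filter, attributes):
    """Expands path expansion attributes in a path filter.

    Args:
      path_filter (str): path filter with {attribute} placeholders.
      attributes (dict[str, str]): runtime values per attribute name.

    Returns:
      str: expanded path filter, with "{{" and "}}" reduced to literal
          curly brackets.
    """
    # Note that str.format raises KeyError for an undefined attribute.
    return path_filter.format(**attributes)


attributes = {'my_attribute': 'MyValue'}
print(expand_path_filter('System{my_attribute}PathKeyname', attributes))
print(expand_path_filter(
    'System{my_attribute}{{123-AF25-E523}}KeyName', attributes))
```

The first call prints "SystemMyValuePathKeyname" and the second prints "SystemMyValue{123-AF25-E523}KeyName".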
-
BuildFindSpecs
(environment_variables=None)[source]¶ Builds find specifications from a filter file.
Parameters: environment_variables (Optional[list[EnvironmentVariableArtifact]]) – environment variables. Returns: find specifications. Return type: list[dfvfs.FindSpec]
-
plaso.engine.knowledge_base module¶
The artifact knowledge base object.
The knowledge base is filled by user provided input and the pre-processing phase. It is intended to provide successive phases, such as the parsing and analysis phases, with essential information such as the time zone and codepage of the source data.
-
class
plaso.engine.knowledge_base.
KnowledgeBase
[source]¶ Bases:
object
Class that implements the artifact knowledge base.
-
AddEnvironmentVariable
(environment_variable)[source]¶ Adds an environment variable.
Parameters: environment_variable (EnvironmentVariableArtifact) – environment variable artifact. Raises: KeyError
– if the environment variable already exists.
-
AddUserAccount
(user_account, session_identifier=0)[source]¶ Adds a user account.
Parameters: - user_account (UserAccountArtifact) – user account artifact.
- session_identifier (Optional[str]) – session identifier, where CURRENT_SESSION represents the active session.
Raises: KeyError
– if the user account already exists.
-
CURRENT_SESSION
= 0¶
-
GetEnvironmentVariable
(name)[source]¶ Retrieves an environment variable.
Parameters: name (str) – name of the environment variable. Returns: environment variable artifact or None if there was no value set for the given name.
Return type: EnvironmentVariableArtifact
-
GetEnvironmentVariables
()[source]¶ Retrieves the environment variables.
Returns: environment variable artifacts. Return type: list[EnvironmentVariableArtifact]
-
GetHostname
(session_identifier=0)[source]¶ Retrieves the hostname related to the event.
If the hostname is not stored in the event it is determined based on the preprocessing information that is stored inside the storage file.
Parameters: session_identifier (Optional[str]) – session identifier, where CURRENT_SESSION represents the active session. Returns: hostname. Return type: str
-
GetStoredHostname
()[source]¶ Retrieves the stored hostname.
The hostname is determined based on the preprocessing information that is stored inside the storage file.
Returns: hostname. Return type: str
-
GetSystemConfigurationArtifact
(session_identifier=0)[source]¶ Retrieves the knowledge base as a system configuration artifact.
Parameters: session_identifier (Optional[str]) – session identifier, where CURRENT_SESSION represents the active session. Returns: system configuration artifact. Return type: SystemConfigurationArtifact
-
GetUsernameByIdentifier
(user_identifier, session_identifier=0)[source]¶ Retrieves the username based on a user identifier.
Parameters: - user_identifier (str) – user identifier, either a UID or SID.
- session_identifier (Optional[str]) – session identifier, where CURRENT_SESSION represents the active session.
Returns: username.
Return type: str
-
GetUsernameForPath
(path)[source]¶ Retrieves a username for a specific path.
Determines whether a specific path is within a user’s directory and, if so, returns the username of that user.
Parameters: path (str) – path. Returns: username or None if the path does not appear to be within a user’s directory.
Return type: str
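The lookup described above can be sketched as a prefix match of the path against each user's directory. This is an illustration only: the function name `username_for_path`, the dict-based input and the case-insensitive comparison are assumptions, not the plaso implementation.

```python
def username_for_path(path, user_directories):
    """Retrieves a username for a specific path.

    Args:
      path (str): path.
      user_directories (dict[str, str]): user directory per username
          (hypothetical input for this sketch).

    Returns:
      str: username or None if the path is not within a user's directory.
    """
    # Assumption: compare case-insensitively, as on Windows file systems.
    normalized_path = path.lower()
    for username, directory in user_directories.items():
        prefix = directory.lower().rstrip('/\\')
        # Require a separator after the prefix so that "/home/bob" does
        # not match "/home/bobby".
        if normalized_path == prefix or normalized_path.startswith(
                (prefix + '/', prefix + '\\')):
            return username
    return None
```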
-
GetValue
(identifier, default_value=None)[source]¶ Retrieves a value by identifier.
Parameters: - identifier (str) – case insensitive unique identifier for the value.
- default_value (object) – default value.
Returns: value or default value if not available.
Return type: object
Raises: TypeError
– if the identifier is not a string type.
-
HasUserAccounts
()[source]¶ Determines if the knowledge base contains user accounts.
Returns: True if the knowledge base contains user accounts. Return type: bool
-
ReadSystemConfigurationArtifact
(system_configuration, session_identifier=0)[source]¶ Reads the knowledge base values from a system configuration artifact.
Note that this overwrites existing values in the knowledge base.
Parameters: - system_configuration (SystemConfigurationArtifact) – system configuration artifact.
- session_identifier (Optional[str]) – session identifier, where CURRENT_SESSION represents the active session.
-
SetCodepage
(codepage)[source]¶ Sets the codepage.
Parameters: codepage (str) – codepage. Raises: ValueError
– if the codepage is not supported.
-
SetEnvironmentVariable
(environment_variable)[source]¶ Sets an environment variable.
Parameters: environment_variable (EnvironmentVariableArtifact) – environment variable artifact.
-
SetHostname
(hostname, session_identifier=0)[source]¶ Sets a hostname.
Parameters: - hostname (HostnameArtifact) – hostname artifact.
- session_identifier (Optional[str]) – session identifier, where CURRENT_SESSION represents the active session.
-
SetTimeZone
(time_zone)[source]¶ Sets the time zone.
Parameters: time_zone (str) – time zone. Raises: ValueError
– if the timezone is not supported.
-
SetValue
(identifier, value)[source]¶ Sets a value by identifier.
Parameters: - identifier (str) – case insensitive unique identifier for the value.
- value (object) – value.
Raises: TypeError
– if the identifier is not a string type.
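The documented behavior of GetValue and SetValue (case insensitive identifiers, a default value, TypeError for non-string identifiers) can be sketched with a dictionary keyed on the lowercased identifier. The class `ValueStore` is hypothetical and only mirrors the documented contract.

```python
class ValueStore(object):
    """Hypothetical case insensitive key-value store."""

    def __init__(self):
        self._values = {}

    def set_value(self, identifier, value):
        """Sets a value by identifier."""
        if not isinstance(identifier, str):
            raise TypeError('Identifier not a string type.')
        # Lowercase the identifier to make lookups case insensitive.
        self._values[identifier.lower()] = value

    def get_value(self, identifier, default_value=None):
        """Retrieves a value by identifier or the default value."""
        if not isinstance(identifier, str):
            raise TypeError('Identifier not a string type.')
        return self._values.get(identifier.lower(), default_value)
```

With this sketch, a value stored as 'OperatingSystem' can be read back as 'operatingsystem', and an unknown identifier returns the default value.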
-
codepage
¶ str – codepage of the current session.
-
hostname
¶ str – hostname of the current session.
-
timezone
¶ datetime.tzinfo – timezone of the current session.
-
user_accounts
¶ list[UserAccountArtifact] – user accounts of the current session.
-
year
¶ int – year of the current session.
-
plaso.engine.logger module¶
The engine sub module logger.
plaso.engine.path_helper module¶
The path helper.
-
class
plaso.engine.path_helper.
PathHelper
[source]¶ Bases:
object
Class that implements the path helper.
-
classmethod
AppendPathEntries
(path, path_separator, count, skip_first)[source]¶ Appends wildcard entries to end of path.
Appends the wildcard * to the given path, building a list of strings for “count” iterations, and skips the first directory if skip_first is true.
Parameters: - path (str) – path to append wildcards to.
- path_separator (str) – path segment separator.
- count (int) – number of entries to be appended.
- skip_first (bool) – whether or not to skip the first entry to append.
Returns: paths that were expanded from the path with wildcards.
Return type: list[str]
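One reading of the description above is sketched below: each iteration appends one more wildcard segment, and the first result is dropped when skip_first is set. The function `append_path_entries` is a hypothetical illustration and the exact output of the plaso method may differ.

```python
def append_path_entries(path, path_separator, count, skip_first):
    """Appends wildcard entries to the end of a path.

    Returns:
      list[str]: paths with 1 up to count wildcard segments appended.
    """
    paths = []
    current_path = path
    for index in range(count):
        # Each iteration goes one directory level deeper.
        current_path = '{0:s}{1:s}*'.format(current_path, path_separator)
        if skip_first and index == 0:
            continue
        paths.append(current_path)
    return paths


print(append_path_entries('/var/log', '/', 3, False))
```

Under these assumptions the call prints `['/var/log/*', '/var/log/*/*', '/var/log/*/*/*']`.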
-
classmethod
ExpandRecursiveGlobs
(path, path_separator)[source]¶ Expands recursive like globs present in an artifact path.
If a path ends in ‘**’, with up to two optional digits such as ‘**10’, the ‘**’ will recursively match all files and zero or more directories from the specified path. The optional digits indicate the recursion depth. By default the recursion depth is 10 directories.
If the glob is followed by the specified path segment separator, only directories and subdirectories will be matched.
Parameters: - path (str) – path to be expanded.
- path_separator (str) – path segment separator.
Returns: String path expanded for each glob.
Return type: list[str]
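The recursive glob rule above can be sketched with a regular expression that detects a trailing ‘**’ with optional digits. The function `expand_recursive_glob` is hypothetical; it covers only a glob at the end of the path and does not handle a glob followed by a path segment separator.

```python
import re

# Matches a trailing "**" with up to two optional digits, such as "**10".
_RECURSIVE_GLOB_RE = re.compile(r'\*\*(\d{1,2})?$')


def expand_recursive_glob(path, path_separator, default_depth=10):
    """Expands a trailing recursive glob into one path per depth level."""
    match = _RECURSIVE_GLOB_RE.search(path)
    if not match:
        return [path]

    depth = int(match.group(1)) if match.group(1) else default_depth
    current_path = path[:match.start()].rstrip(path_separator)

    expanded_paths = []
    for _ in range(depth):
        # Add one wildcard segment per recursion level.
        current_path = '{0:s}{1:s}*'.format(current_path, path_separator)
        expanded_paths.append(current_path)
    return expanded_paths


print(expand_recursive_glob('/etc/**2', '/'))
```

Under these assumptions the call prints `['/etc/*', '/etc/*/*']`, and a path ending in a bare ‘**’ expands to 10 levels.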
-
classmethod
ExpandUsersHomeDirectoryPath
(path, user_accounts)[source]¶ Expands a path to contain all users home or profile directories.
Expands the GRR artifacts path variable “%%users.homedir%%”.
Parameters: - path (str) – Windows path with environment variables.
- user_accounts (list[UserAccountArtifact]) – user accounts.
Returns: paths returned for user accounts without a drive letter.
Return type: list[str]
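A possible reading of this expansion is sketched below: the “%%users.homedir%%” variable is replaced once per user, and a leading drive letter is dropped from each home directory. The function name, the plain list of home directory strings and the drive letter handling are assumptions made for illustration.

```python
_USERS_HOMEDIR = '%%users.homedir%%'


def expand_users_home_directory(path, home_directories):
    """Expands a path to contain all users' home directories.

    Args:
      path (str): path that may contain %%users.homedir%%.
      home_directories (list[str]): home directory per user account
          (hypothetical input for this sketch).

    Returns:
      list[str]: one expanded path per home directory.
    """
    if _USERS_HOMEDIR not in path:
        return [path]

    expanded_paths = []
    for home_directory in home_directories:
        # Assumption: strip a leading drive letter such as "C:".
        if (len(home_directory) >= 2 and home_directory[0].isalpha() and
                home_directory[1] == ':'):
            home_directory = home_directory[2:]
        expanded_paths.append(path.replace(_USERS_HOMEDIR, home_directory))
    return expanded_paths
```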
-
classmethod
ExpandWindowsPath
(path, environment_variables)[source]¶ Expands a Windows path containing environment variables.
Parameters: - path (str) – Windows path with environment variables.
- environment_variables (list[EnvironmentVariableArtifact]) – environment variables.
Returns: expanded Windows path.
Return type: str
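Expanding a Windows path can be sketched as replacing each %NAME% segment with the value of the matching environment variable. The sketch uses a plain dict instead of EnvironmentVariableArtifact objects, matches names case-insensitively and leaves unknown variables untouched; all three choices are assumptions.

```python
import re

# Matches an environment variable reference such as %SystemRoot%.
_ENVIRONMENT_VARIABLE_RE = re.compile(r'%([^%]+)%')


def expand_windows_path(path, environment_variables):
    """Expands a Windows path containing environment variables.

    Args:
      path (str): Windows path with environment variables.
      environment_variables (dict[str, str]): value per variable name
          (hypothetical input for this sketch).

    Returns:
      str: expanded Windows path.
    """
    lookup = {
        name.upper(): value
        for name, value in environment_variables.items()}

    def _replace(match):
        # Keep the original text when the variable is not defined.
        return lookup.get(match.group(1).upper(), match.group(0))

    return _ENVIRONMENT_VARIABLE_RE.sub(_replace, path)


print(expand_windows_path(
    '%SystemRoot%\\System32', {'SystemRoot': 'C:\\Windows'}))
```

Under these assumptions the call prints `C:\Windows\System32`.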
-
classmethod
GetDisplayNameForPathSpec
(path_spec, mount_path=None, text_prepend=None)[source]¶ Retrieves the display name of a path specification.
Parameters: - path_spec (dfvfs.PathSpec) – path specification.
- mount_path (Optional[str]) – path where the file system that is used by the path specification is mounted, such as “/mnt/image”. The mount path will be stripped from the absolute path defined by the path specification.
- text_prepend (Optional[str]) – text to prepend.
Returns: human readable version of the path specification or None.
Return type: str
-
classmethod
GetRelativePathForPathSpec
(path_spec, mount_path=None)[source]¶ Retrieves the relative path of a path specification.
If a mount path is defined the path will be relative to the mount point, otherwise the path is relative to the root of the file system that is used by the path specification.
Parameters: - path_spec (dfvfs.PathSpec) – path specification.
- mount_path (Optional[str]) – path where the file system that is used by the path specification is mounted, such as “/mnt/image”. The mount path will be stripped from the absolute path defined by the path specification.
Returns: relative path or None.
Return type: str
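Stripping the mount path, as described above, can be sketched on a plain location string. The function `relative_path` is hypothetical and works on a string rather than a dfvfs.PathSpec, which is a simplification for illustration.

```python
def relative_path(location, mount_path=None):
    """Retrieves a path relative to the mount point.

    Args:
      location (str): absolute path defined by the path specification.
      mount_path (Optional[str]): path where the file system is mounted,
          such as "/mnt/image".

    Returns:
      str: relative path or None.
    """
    if not location:
        return None
    # Strip the mount path from the start of the absolute path.
    if mount_path and location.startswith(mount_path):
        location = location[len(mount_path):]
    return location


print(relative_path('/mnt/image/etc/passwd', mount_path='/mnt/image'))
```

Under these assumptions the call prints `/etc/passwd`.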
plaso.engine.plaso_queue module¶
Queue management implementation for Plaso.
This file contains an implementation of a queue used by plaso for queue management.
The queue has been abstracted in order to provide support for different implementations of the queueing mechanism, to support multi processing and scalability.
plaso.engine.process_info module¶
Information about a running process.
plaso.engine.processing_status module¶
Processing status classes.
-
class
plaso.engine.processing_status.
ProcessStatus
[source]¶ Bases:
object
The status of an individual process.
-
display_name
¶ str – human readable name of the file entry currently being processed by the process.
-
identifier
¶ str – process identifier.
-
last_running_time
¶ int – timestamp of the last update when the process had a running process status.
-
number_of_consumed_errors
¶ int – total number of errors consumed by the process.
-
number_of_consumed_errors_delta
¶ int – number of errors consumed by the process since the last status update.
-
number_of_consumed_event_tags
¶ int – total number of event tags consumed by the process.
-
number_of_consumed_event_tags_delta
¶ int – number of event tags consumed by the process since the last status update.
-
number_of_consumed_events
¶ int – total number of events consumed by the process.
-
number_of_consumed_events_delta
¶ int – number of events consumed by the process since the last status update.
-
number_of_consumed_reports
¶ int – total number of event reports consumed by the process.
-
number_of_consumed_reports_delta
¶ int – number of event reports consumed by the process since the last status update.
-
number_of_consumed_sources
¶ int – total number of event sources consumed by the process.
-
number_of_consumed_sources_delta
¶ int – number of event sources consumed by the process since the last status update.
-
number_of_produced_errors
¶ int – total number of errors produced by the process.
-
number_of_produced_errors_delta
¶ int – number of errors produced by the process since the last status update.
-
number_of_produced_event_tags
¶ int – total number of event tags produced by the process.
-
number_of_produced_event_tags_delta
¶ int – number of event tags produced by the process since the last status update.
-
number_of_produced_events
¶ int – total number of events produced by the process.
-
number_of_produced_events_delta
¶ int – number of events produced by the process since the last status update.
-
number_of_produced_reports
¶ int – total number of event reports produced by the process.
-
number_of_produced_reports_delta
¶ int – number of event reports produced by the process since the last status update.
-
number_of_produced_sources
¶ int – total number of event sources produced by the process.
-
number_of_produced_sources_delta
¶ int – number of event sources produced by the process since the last status update.
-
pid
¶ int – process identifier (PID).
-
status
¶ str – human readable status indication e.g. ‘Hashing’, ‘Idle’.
-
used_memory
¶ int – size of used memory in bytes.
-
UpdateNumberOfErrors
(number_of_consumed_errors, number_of_produced_errors)[source]¶ Updates the number of errors.
Parameters: - number_of_consumed_errors (int) – total number of errors consumed by the process.
- number_of_produced_errors (int) – total number of errors produced by the process.
Returns: True if either number of errors has increased.
Return type: bool
Raises: ValueError
– if the consumed or produced number of errors is smaller than the value of the previous update.
-
UpdateNumberOfEventReports
(number_of_consumed_reports, number_of_produced_reports)[source]¶ Updates the number of event reports.
Parameters: - number_of_consumed_reports (int) – total number of event reports consumed by the process.
- number_of_produced_reports (int) – total number of event reports produced by the process.
Returns: True if either number of event reports has increased.
Return type: bool
Raises: ValueError
– if the consumed or produced number of event reports is smaller than the value of the previous update.
-
UpdateNumberOfEventSources
(number_of_consumed_sources, number_of_produced_sources)[source]¶ Updates the number of event sources.
Parameters: - number_of_consumed_sources (int) – total number of event sources consumed by the process.
- number_of_produced_sources (int) – total number of event sources produced by the process.
Returns: True if either number of event sources has increased.
Return type: bool
Raises: ValueError
– if the consumed or produced number of event sources is smaller than the value of the previous update.
-
UpdateNumberOfEventTags
(number_of_consumed_event_tags, number_of_produced_event_tags)[source]¶ Updates the number of event tags.
Parameters: - number_of_consumed_event_tags (int) – total number of event tags consumed by the process.
- number_of_produced_event_tags (int) – total number of event tags produced by the process.
Returns: True if either number of event tags has increased.
Return type: bool
Raises: ValueError
– if the consumed or produced number of event tags is smaller than the value of the previous update.
-
UpdateNumberOfEvents
(number_of_consumed_events, number_of_produced_events)[source]¶ Updates the number of events.
Parameters: - number_of_consumed_events (int) – total number of events consumed by the process.
- number_of_produced_events (int) – total number of events produced by the process.
Returns: True if either number of events has increased.
Return type: bool
Raises: ValueError
– if the consumed or produced number of events is smaller than the value of the previous update.
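The UpdateNumberOf* methods above share one contract: totals may only grow, the deltas record the change since the previous update, and the return value reports whether anything increased. The class `CounterPair` below is a hypothetical sketch of that contract and not the plaso implementation.

```python
class CounterPair(object):
    """Hypothetical consumed and produced counters with deltas."""

    def __init__(self):
        self.number_of_consumed = 0
        self.number_of_consumed_delta = 0
        self.number_of_produced = 0
        self.number_of_produced_delta = 0

    def update(self, number_of_consumed, number_of_produced):
        """Updates the totals and returns True if either has increased."""
        if number_of_consumed < self.number_of_consumed:
            raise ValueError(
                'Consumed number smaller than value of previous update.')
        if number_of_produced < self.number_of_produced:
            raise ValueError(
                'Produced number smaller than value of previous update.')

        # The delta is the change since the last status update.
        self.number_of_consumed_delta = (
            number_of_consumed - self.number_of_consumed)
        self.number_of_consumed = number_of_consumed
        self.number_of_produced_delta = (
            number_of_produced - self.number_of_produced)
        self.number_of_produced = number_of_produced

        return (self.number_of_consumed_delta > 0 or
                self.number_of_produced_delta > 0)
```

Calling `update(3, 5)` on a new instance returns True, repeating the same call returns False, and a following `update(2, 5)` raises ValueError.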
-
-
class
plaso.engine.processing_status.
ProcessingStatus
[source]¶ Bases:
object
The status of the overall extraction process (processing).
-
aborted
¶ bool – True if processing was aborted.
-
error_path_specs
¶ list[dfvfs.PathSpec] – path specifications that caused critical errors during processing.
-
foreman_status
¶ ProcessingStatus – foreman processing status.
-
start_time
¶ float – time that the processing was started. Contains the number of microseconds since January 1, 1970, 00:00:00 UTC.
-
tasks_status
¶ TasksStatus – status information about tasks.
-
UpdateForemanStatus
(identifier, status, pid, used_memory, display_name, number_of_consumed_sources, number_of_produced_sources, number_of_consumed_events, number_of_produced_events, number_of_consumed_event_tags, number_of_produced_event_tags, number_of_consumed_errors, number_of_produced_errors, number_of_consumed_reports, number_of_produced_reports)[source]¶ Updates the status of the foreman.
Parameters: - identifier (str) – foreman identifier.
- status (str) – human readable status of the foreman e.g. ‘Idle’.
- pid (int) – process identifier (PID).
- used_memory (int) – size of used memory in bytes.
- display_name (str) – human readable name of the file entry currently being processed by the foreman.
- number_of_consumed_sources (int) – total number of event sources consumed by the foreman.
- number_of_produced_sources (int) – total number of event sources produced by the foreman.
- number_of_consumed_events (int) – total number of events consumed by the foreman.
- number_of_produced_events (int) – total number of events produced by the foreman.
- number_of_consumed_event_tags (int) – total number of event tags consumed by the foreman.
- number_of_produced_event_tags (int) – total number of event tags produced by the foreman.
- number_of_consumed_errors (int) – total number of errors consumed by the foreman.
- number_of_produced_errors (int) – total number of errors produced by the foreman.
- number_of_consumed_reports (int) – total number of event reports consumed by the foreman.
- number_of_produced_reports (int) – total number of event reports produced by the foreman.
-
UpdateTasksStatus
(tasks_status)[source]¶ Updates the tasks status.
Parameters: tasks_status (TasksStatus) – status information about tasks.
-
UpdateWorkerStatus
(identifier, status, pid, used_memory, display_name, number_of_consumed_sources, number_of_produced_sources, number_of_consumed_events, number_of_produced_events, number_of_consumed_event_tags, number_of_produced_event_tags, number_of_consumed_errors, number_of_produced_errors, number_of_consumed_reports, number_of_produced_reports)[source]¶ Updates the status of a worker.
Parameters: - identifier (str) – worker identifier.
- status (str) – human readable status of the worker e.g. ‘Idle’.
- pid (int) – process identifier (PID).
- used_memory (int) – size of used memory in bytes.
- display_name (str) – human readable name of the file entry currently being processed by the worker.
- number_of_consumed_sources (int) – total number of event sources consumed by the worker.
- number_of_produced_sources (int) – total number of event sources produced by the worker.
- number_of_consumed_events (int) – total number of events consumed by the worker.
- number_of_produced_events (int) – total number of events produced by the worker.
- number_of_consumed_event_tags (int) – total number of event tags consumed by the worker.
- number_of_produced_event_tags (int) – total number of event tags produced by the worker.
- number_of_consumed_errors (int) – total number of errors consumed by the worker.
- number_of_produced_errors (int) – total number of errors produced by the worker.
- number_of_consumed_reports (int) – total number of event reports consumed by the worker.
- number_of_produced_reports (int) – total number of event reports produced by the worker.
-
workers_status
¶ The worker status objects sorted by identifier.
-
-
class
plaso.engine.processing_status.
TasksStatus
[source]¶ Bases:
object
The status of the tasks.
-
number_of_abandoned_tasks
¶ int – number of abandoned tasks.
-
number_of_queued_tasks
¶ int – number of queued tasks.
-
number_of_tasks_pending_merge
¶ int – number of tasks pending merge.
-
number_of_tasks_processing
¶ int – number of tasks processing.
-
total_number_of_tasks
¶ int – total number of tasks.
-
plaso.engine.profilers module¶
The profiler classes.
-
class
plaso.engine.profilers.
CPUTimeMeasurement
[source]¶ Bases:
object
The CPU time measurement.
-
start_sample_time
¶ float – start sample time or None if not set.
-
total_cpu_time
¶ float – total CPU time or None if not set.
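A minimal sketch of how two attributes like these might be maintained, assuming the measurement is driven by a start and a stop call. The class name and the `SampleStart` and `SampleStop` methods are assumptions made for this example (they are not documented in this section), and `time.process_time()` is used as the CPU clock.

```python
import time


class CPUTimeMeasurementSketch(object):
    """Hypothetical CPU time measurement with the two documented attributes."""

    def __init__(self):
        # Both attributes stay None until a sample has been taken.
        self.start_sample_time = None
        self.total_cpu_time = None

    def SampleStart(self):
        """Records the CPU time at the start of a measured section."""
        self.start_sample_time = time.process_time()

    def SampleStop(self):
        """Adds the CPU time used since SampleStart to the total."""
        if self.start_sample_time is None:
            # Nothing was started, so there is nothing to accumulate.
            return

        elapsed = time.process_time() - self.start_sample_time
        self.total_cpu_time = (self.total_cpu_time or 0.0) + elapsed
        self.start_sample_time = None


measurement = CPUTimeMeasurementSketch()
measurement.SampleStart()
sum(range(100000))  # Some CPU-bound work to measure.
measurement.SampleStop()
print('total CPU time: {0:f}'.format(measurement.total_cpu_time))
```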
-
-
class
plaso.engine.profilers.
CPUTimeProfiler
(identifier, configuration)[source]¶ Bases:
plaso.engine.profilers.SampleFileProfiler
The CPU time profiler.
-
class
plaso.engine.profilers.
GuppyMemoryProfiler
(identifier, configuration)[source]¶ Bases:
object
The guppy-based memory profiler.
-
class
plaso.engine.profilers.
MemoryProfiler
(identifier, configuration)[source]¶ Bases:
plaso.engine.profilers.SampleFileProfiler
The memory profiler.
-
class
plaso.engine.profilers.
ProcessingProfiler
(identifier, configuration)[source]¶ Bases:
plaso.engine.profilers.CPUTimeProfiler
The processing profiler.
-
class
plaso.engine.profilers.
SampleFileProfiler
(identifier, configuration)[source]¶ Bases:
object
Shared functionality for sample file-based profilers.
-
class
plaso.engine.profilers.
SerializersProfiler
(identifier, configuration)[source]¶ Bases:
plaso.engine.profilers.CPUTimeProfiler
The serializers profiler.
-
class
plaso.engine.profilers.
StorageProfiler
(identifier, configuration)[source]¶ Bases:
plaso.engine.profilers.SampleFileProfiler
The storage profiler.
-
Sample
(operation, description, data_size, compressed_data_size)[source]¶ Takes a sample of data read or written for profiling.
Parameters: - operation (str) – operation, either ‘read’ or ‘write’.
- description (str) – description of the data read or written.
- data_size (int) – size of the data read or written in bytes.
- compressed_data_size (int) – size of the compressed data read or written in bytes.
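As an illustration of what a sample file-based profiler could do with these parameters, the sketch below appends one tab-separated row per sample. The class `StorageProfilerSketch`, the column layout and the validation of `operation` are assumptions for this example and do not describe the format Plaso writes.

```python
import io
import time


class StorageProfilerSketch(object):
    """Hypothetical profiler that writes one row per sample to a file object."""

    def __init__(self, sample_file):
        self._sample_file = sample_file
        # The header is an assumed layout, chosen to mirror the parameters.
        self._sample_file.write(
            'Time\tOperation\tDescription\tData size\tCompressed data size\n')

    def Sample(self, operation, description, data_size, compressed_data_size):
        """Writes a sample of data read or written."""
        if operation not in ('read', 'write'):
            raise ValueError('Unsupported operation: {0!s}'.format(operation))

        self._sample_file.write('{0:f}\t{1:s}\t{2:s}\t{3:d}\t{4:d}\n'.format(
            time.time(), operation, description, data_size,
            compressed_data_size))


sample_file = io.StringIO()
profiler = StorageProfilerSketch(sample_file)
profiler.Sample('write', 'event', 2048, 512)
profiler.Sample('read', 'event_source', 300, 120)
print(sample_file.getvalue())
```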
-
-
class
plaso.engine.profilers.
TaskQueueProfiler
(identifier, configuration)[source]¶ Bases:
plaso.engine.profilers.SampleFileProfiler
The task queue profiler.
-
Sample
(tasks_status)[source]¶ Takes a sample of the status of queued tasks for profiling.
Parameters: tasks_status (TasksStatus) – status information about tasks.
-
plaso.engine.single_process module¶
The single process processing engine.
-
class
plaso.engine.single_process.
SingleProcessEngine
[source]¶ Bases:
plaso.engine.engine.BaseEngine
Class that defines the single process engine.
-
ProcessSources
(source_path_specs, storage_writer, resolver_context, processing_configuration, filter_find_specs=None, status_update_callback=None)[source]¶ Processes the sources.
Parameters: - source_path_specs (list[dfvfs.PathSpec]) – path specifications of the sources to process.
- storage_writer (StorageWriter) – storage writer for a session storage.
- resolver_context (dfvfs.Context) – resolver context.
- processing_configuration (ProcessingConfiguration) – processing configuration.
- filter_find_specs (Optional[list[dfvfs.FindSpec]]) – find specifications used in path specification extraction.
- status_update_callback (Optional[function]) – callback function for status updates.
Returns: processing status.
Return type: ProcessingStatus
-
plaso.engine.tagging_file module¶
Tagging file.
plaso.engine.worker module¶
The event extraction worker.
-
class
plaso.engine.worker.
EventExtractionWorker
(parser_filter_expression=None)[source]¶ Bases:
object
Event extraction worker.
The event extraction worker determines which parsers are suitable for parsing a particular file entry or data stream. The parsers extract relevant data from file system and/or file content data. All extracted data is passed to the parser mediator for further processing.
-
last_activity_timestamp
¶ int – timestamp received that indicates the last time activity was observed.
-
processing_status
¶ str – human readable status indication such as: ‘Extracting’, ‘Hashing’.
-
GetAnalyzerNames
()[source]¶ Gets the names of the active analyzers.
Returns: names of active analyzers. Return type: list[str]
-
ProcessPathSpec
(mediator, path_spec)[source]¶ Processes a path specification.
Parameters: - mediator (ParserMediator) – mediates the interactions between parsers and other components, such as storage and abort signals.
- path_spec (dfvfs.PathSpec) – path specification.
-
SetExtractionConfiguration
(configuration)[source]¶ Sets the extraction configuration settings.
Parameters: configuration (ExtractionConfiguration) – extraction configuration.
-
SetProcessingProfiler
(processing_profiler)[source]¶ Sets the processing profiler.
Parameters: processing_profiler (ProcessingProfiler) – processing profiler.
-
plaso.engine.zeromq_queue module¶
ZeroMQ implementations of the Plaso queue interface.
-
class
plaso.engine.zeromq_queue.
ZeroMQBufferedQueue
(buffer_timeout_seconds=2, buffer_max_size=10000, delay_open=True, linger_seconds=10, maximum_items=1000, name='Unnamed', port=None, timeout_seconds=5)[source]¶ Bases:
plaso.engine.zeromq_queue.ZeroMQQueue
Parent class for buffered Plaso queues.
Buffered queues use a regular Python queue to store items that are pushed or popped from the queue without blocking on underlying ZeroMQ operations.
This class should not be instantiated directly; instantiate a subclass instead.
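The buffering idea can be sketched with the standard library alone. In the example below, `BufferedPushSketch`, its `Flush` method and the local `QueueFull` exception are hypothetical; the real classes hand buffered items to a ZeroMQ socket, which is replaced here by a plain list so the sketch runs without pyzmq.

```python
import queue


class QueueFull(Exception):
    """Hypothetical stand-in for the queue-full error of the queue interface."""


class BufferedPushSketch(object):
    """Hypothetical push queue that buffers items in a regular Python queue."""

    def __init__(self, buffer_max_size=2, buffer_timeout_seconds=0.01):
        self._buffer = queue.Queue(maxsize=buffer_max_size)
        self._buffer_timeout_seconds = buffer_timeout_seconds
        # Stands in for items handed to the underlying socket.
        self.sent = []

    def PushItem(self, item, block=True):
        """Stores the item in the buffer without touching the socket."""
        try:
            if block:
                self._buffer.put(item, timeout=self._buffer_timeout_seconds)
            else:
                self._buffer.put(item, block=False)
        except queue.Full as exception:
            raise QueueFull('Buffer is full.') from exception

    def Flush(self):
        """Moves every buffered item to the stand-in socket."""
        while True:
            try:
                item = self._buffer.get_nowait()
            except queue.Empty:
                break
            self.sent.append(item)


buffered_queue = BufferedPushSketch()
buffered_queue.PushItem('first')
buffered_queue.PushItem('second')
try:
    buffered_queue.PushItem('third')
except QueueFull:
    print('buffer full, flushing')
    buffered_queue.Flush()
    buffered_queue.PushItem('third')
```

The caller only ever waits on the in-memory buffer, bounded by the buffer timeout, which is the property the class description refers to.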
-
Close
(abort=False)[source]¶ Closes the queue.
Parameters: abort (Optional[bool]) – whether the Close is the result of an abort condition. If True, queue contents may be lost.
Raises: - QueueAlreadyClosed – if the queue is not started, or has already been closed.
- RuntimeError – if closed or terminate event is missing.
-
-
class
plaso.engine.zeromq_queue.
ZeroMQBufferedReplyBindQueue
(buffer_timeout_seconds=2, buffer_max_size=10000, delay_open=True, linger_seconds=10, maximum_items=1000, name='Unnamed', port=None, timeout_seconds=5)[source]¶ Bases:
plaso.engine.zeromq_queue.ZeroMQBufferedReplyQueue
A Plaso queue backed by a ZeroMQ REP socket that binds to a port.
This queue may only be used to push items, not to pop.
-
SOCKET_CONNECTION_TYPE
= 1¶
-
-
class
plaso.engine.zeromq_queue.
ZeroMQBufferedReplyQueue
(buffer_timeout_seconds=2, buffer_max_size=10000, delay_open=True, linger_seconds=10, maximum_items=1000, name='Unnamed', port=None, timeout_seconds=5)[source]¶ Bases:
plaso.engine.zeromq_queue.ZeroMQBufferedQueue
Parent class for buffered Plaso queues backed by ZeroMQ REP sockets.
This class should not be instantiated directly; instantiate a subclass instead.
Instances of this class or subclasses may only be used to push items, not to pop.
-
PopItem
()[source]¶ Pops an item off the queue.
Provided for compatibility with the API, but doesn’t actually work.
Raises: WrongQueueType – as Pop is not supported by this queue.
-
PushItem
(item, block=True)[source]¶ Pushes an item on to the queue.
If no ZeroMQ socket has been created, one will be created the first time this method is called.
Parameters: - item (object) – item to push on the queue.
- block (Optional[bool]) – whether the push should be performed in blocking or non-blocking mode.
Raises: - QueueAlreadyClosed – if the queue is closed.
- QueueFull – if the internal buffer was full and it was not possible to push the item to the buffer within the timeout.
- RuntimeError – if closed event is missing.
-
-
class
plaso.engine.zeromq_queue.
ZeroMQPullConnectQueue
(delay_open=True, linger_seconds=10, maximum_items=1000, name='Unnamed', port=None, timeout_seconds=5)[source]¶ Bases:
plaso.engine.zeromq_queue.ZeroMQPullQueue
A Plaso queue backed by a ZeroMQ PULL socket that connects to a port.
This queue may only be used to pop items, not to push.
-
SOCKET_CONNECTION_TYPE
= 2¶
-
-
class
plaso.engine.zeromq_queue.
ZeroMQPullQueue
(delay_open=True, linger_seconds=10, maximum_items=1000, name='Unnamed', port=None, timeout_seconds=5)[source]¶ Bases:
plaso.engine.zeromq_queue.ZeroMQQueue
Parent class for Plaso queues backed by ZeroMQ PULL sockets.
This class should not be instantiated directly; instantiate a subclass instead.
Instances of this class or subclasses may only be used to pop items, not to push.
-
PopItem
()[source]¶ Pops an item off the queue.
If no ZeroMQ socket has been created, one will be created the first time this method is called.
Returns: item from the queue.
Return type: object
Raises: - KeyboardInterrupt – if the process is sent a KeyboardInterrupt while popping an item.
- QueueEmpty – if the queue is empty, and no item could be popped within the queue timeout.
- RuntimeError – if closed or terminate event is missing.
- zmq.error.ZMQError – if a ZeroMQ error occurs.
-
PushItem
(item, block=True)[source]¶ Pushes an item on to the queue.
Provided for compatibility with the API, but doesn’t actually work.
Parameters: - item (object) – item to push on the queue.
- block (Optional[bool]) – whether the push should be performed in blocking or non-blocking mode.
Raises: WrongQueueType – as Push is not supported by this queue.
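The pattern of keeping both methods for API compatibility while rejecting the unsupported direction can be sketched as follows. `QueueSketch`, `PullOnlyQueueSketch`, the `_Receive` helper and the local `WrongQueueType` exception are hypothetical names for this example; a `collections.deque` stands in for the socket.

```python
import collections


class WrongQueueType(Exception):
    """Hypothetical stand-in for the exception named above."""


class QueueSketch(object):
    """Hypothetical queue interface with both directions."""

    def __init__(self):
        self._items = collections.deque()

    def PopItem(self):
        """Pops the oldest item off the queue."""
        return self._items.popleft()

    def PushItem(self, item, block=True):
        """Pushes an item on to the queue."""
        self._items.append(item)


class PullOnlyQueueSketch(QueueSketch):
    """Hypothetical queue that can only be popped from."""

    def _Receive(self, item):
        """Simulates an item arriving from the remote end."""
        self._items.append(item)

    def PushItem(self, item, block=True):
        """Kept for API compatibility; always fails."""
        raise WrongQueueType('Push is not supported by this queue.')


pull_queue = PullOnlyQueueSketch()
pull_queue._Receive('path_spec')
print(pull_queue.PopItem())

try:
    pull_queue.PushItem('event')
except WrongQueueType as exception:
    print('rejected: {0!s}'.format(exception))
```

Raising a dedicated exception, rather than silently ignoring the call, means a queue wired up in the wrong direction fails at the first call.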
-
-
class
plaso.engine.zeromq_queue.
ZeroMQPushBindQueue
(delay_open=True, linger_seconds=10, maximum_items=1000, name='Unnamed', port=None, timeout_seconds=5)[source]¶ Bases:
plaso.engine.zeromq_queue.ZeroMQPushQueue
A Plaso queue backed by a ZeroMQ PUSH socket that binds to a port.
This queue may only be used to push items, not to pop.
-
SOCKET_CONNECTION_TYPE
= 1¶
-
-
class
plaso.engine.zeromq_queue.
ZeroMQPushQueue
(delay_open=True, linger_seconds=10, maximum_items=1000, name='Unnamed', port=None, timeout_seconds=5)[source]¶ Bases:
plaso.engine.zeromq_queue.ZeroMQQueue
Parent class for Plaso queues backed by ZeroMQ PUSH sockets.
This class should not be instantiated directly; instantiate a subclass instead.
Instances of this class or subclasses may only be used to push items, not to pop.
-
PopItem
()[source]¶ Pops an item off the queue.
Provided for compatibility with the API, but doesn’t actually work.
Raises: WrongQueueType – as Pop is not supported by this queue.
-
PushItem
(item, block=True)[source]¶ Pushes an item on to the queue.
If no ZeroMQ socket has been created, one will be created the first time this method is called.
Parameters: - item (object) – item to push on the queue.
- block (Optional[bool]) – whether the push should be performed in blocking or non-blocking mode.
Raises: - KeyboardInterrupt – if the process is sent a KeyboardInterrupt while pushing an item.
- QueueFull – if it was not possible to push the item to the queue within the timeout.
- RuntimeError – if terminate event is missing.
- zmq.error.ZMQError – if a ZeroMQ specific error occurs.
-
-
class
plaso.engine.zeromq_queue.
ZeroMQQueue
(delay_open=True, linger_seconds=10, maximum_items=1000, name='Unnamed', port=None, timeout_seconds=5)[source]¶ Bases:
plaso.engine.plaso_queue.Queue
Interface for a ZeroMQ backed queue.
-
name
¶ str – name to identify the queue.
-
port
¶ int – TCP port that the queue is connected or bound to. If the queue is not yet bound or connected to a port, this value will be None.
-
timeout_seconds
¶ int – number of seconds that calls to PopItem and PushItem may block for before raising QueueEmpty (PopItem) or QueueFull (PushItem).
-
Close
(abort=False)[source]¶ Closes the queue.
Parameters: abort (Optional[bool]) – whether the Close is the result of an abort condition. If True, queue contents may be lost.
Raises: - QueueAlreadyClosed – if the queue is not started, or has already been closed.
- RuntimeError – if closed or terminate event is missing.
-
IsEmpty
()[source]¶ Checks if the queue is empty.
ZeroMQ queues don’t have a concept of “empty”: there could always be messages on the queue that a producer or consumer is unaware of. Thus, the queue is never considered empty and this method returns False. Note that PopItem may still fail to pop an item within the timeout and raise a QueueEmpty exception, but this is a different condition.
Returns: False, to indicate that the queue isn’t empty. Return type: bool
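Because IsEmpty always returns False, a consumer cannot use it as a loop condition and has to rely on the timeout of PopItem instead. The sketch below shows one way to do that; `TimeoutQueueSketch`, `Drain`, the `maximum_timeouts` parameter and the local `QueueEmpty` exception are hypothetical, and a standard library queue replaces the ZeroMQ socket.

```python
import queue


class QueueEmpty(Exception):
    """Hypothetical stand-in for the queue-empty error named above."""


class TimeoutQueueSketch(object):
    """Hypothetical queue whose PopItem gives up after a timeout."""

    def __init__(self, timeout_seconds=0.05):
        self._queue = queue.Queue()
        self._timeout_seconds = timeout_seconds

    def IsEmpty(self):
        """Always False: more items may still be in flight."""
        return False

    def PushItem(self, item, block=True):
        """Pushes an item on to the queue."""
        self._queue.put(item, block=block)

    def PopItem(self):
        """Pops an item or raises QueueEmpty after the timeout."""
        try:
            return self._queue.get(timeout=self._timeout_seconds)
        except queue.Empty as exception:
            raise QueueEmpty('No item within the timeout.') from exception


def Drain(item_queue, maximum_timeouts=1):
    """Pops items until PopItem has timed out maximum_timeouts times."""
    items = []
    number_of_timeouts = 0
    while number_of_timeouts < maximum_timeouts:
        try:
            items.append(item_queue.PopItem())
        except QueueEmpty:
            number_of_timeouts += 1
    return items


item_queue = TimeoutQueueSketch()
for item in ('a', 'b', 'c'):
    item_queue.PushItem(item)

print(Drain(item_queue))
```

A timeout only says that nothing arrived in time, so a real consumer would normally combine it with an explicit end-of-input signal before stopping.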
-
Open
()[source]¶ Opens this queue, causing the creation of a ZeroMQ socket.
Raises: QueueAlreadyStarted – if the queue is already started, and a socket already exists.
-
PopItem
()[source]¶ Pops an item off the queue.
Returns: item from the queue.
Return type: object
Raises: QueueEmpty – if the queue is empty, and no item could be popped within the queue timeout.
-
PushItem
(item, block=True)[source]¶ Pushes an item on to the queue.
Parameters: - item (object) – item to push on the queue.
- block (Optional[bool]) – whether the push should be performed in blocking or non-blocking mode.
Raises: QueueAlreadyClosed – if the queue is closed.
-
SOCKET_CONNECTION_BIND
= 1¶
-
SOCKET_CONNECTION_CONNECT
= 2¶
-
SOCKET_CONNECTION_TYPE
= None¶
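One plausible reading of these three constants is that each concrete subclass sets SOCKET_CONNECTION_TYPE to either the bind or the connect value, and the base class uses it to decide how to open its socket. The sketch below illustrates that reading only; `QueueBaseSketch`, `OpenSocket`, `FakeSocket`, the subclass names and the loopback address are assumptions, and the fake socket merely records calls so the example runs without pyzmq.

```python
class FakeSocket(object):
    """Hypothetical socket that records which call was made."""

    def __init__(self):
        self.calls = []

    def bind(self, address):
        self.calls.append(('bind', address))

    def connect(self, address):
        self.calls.append(('connect', address))


class QueueBaseSketch(object):
    """Hypothetical base class mirroring the three documented constants."""

    SOCKET_CONNECTION_BIND = 1
    SOCKET_CONNECTION_CONNECT = 2
    SOCKET_CONNECTION_TYPE = None

    def __init__(self, port):
        self.port = port

    def OpenSocket(self, socket):
        """Binds or connects depending on the subclass's connection type."""
        address = 'tcp://127.0.0.1:{0:d}'.format(self.port)
        if self.SOCKET_CONNECTION_TYPE == self.SOCKET_CONNECTION_BIND:
            socket.bind(address)
        elif self.SOCKET_CONNECTION_TYPE == self.SOCKET_CONNECTION_CONNECT:
            socket.connect(address)
        else:
            raise RuntimeError('Subclass must set SOCKET_CONNECTION_TYPE.')


class PushBindSketch(QueueBaseSketch):
    """Hypothetical subclass that binds to the port."""
    SOCKET_CONNECTION_TYPE = QueueBaseSketch.SOCKET_CONNECTION_BIND


class PullConnectSketch(QueueBaseSketch):
    """Hypothetical subclass that connects to the port."""
    SOCKET_CONNECTION_TYPE = QueueBaseSketch.SOCKET_CONNECTION_CONNECT


bind_socket = FakeSocket()
PushBindSketch(port=5555).OpenSocket(bind_socket)
connect_socket = FakeSocket()
PullConnectSketch(port=5555).OpenSocket(connect_socket)
print(bind_socket.calls, connect_socket.calls)
```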
-
-
class
plaso.engine.zeromq_queue.
ZeroMQRequestConnectQueue
(delay_open=True, linger_seconds=10, maximum_items=1000, name='Unnamed', port=None, timeout_seconds=5)[source]¶ Bases:
plaso.engine.zeromq_queue.ZeroMQRequestQueue
A Plaso queue backed by a ZeroMQ REQ socket that connects to a port.
This queue may only be used to pop items, not to push.
-
SOCKET_CONNECTION_TYPE
= 2¶
-
-
class
plaso.engine.zeromq_queue.
ZeroMQRequestQueue
(delay_open=True, linger_seconds=10, maximum_items=1000, name='Unnamed', port=None, timeout_seconds=5)[source]¶ Bases:
plaso.engine.zeromq_queue.ZeroMQQueue
Parent class for Plaso queues backed by ZeroMQ REQ sockets.
This class should not be instantiated directly; instantiate a subclass instead.
Instances of this class or subclasses may only be used to pop items, not to push.
-
PopItem
()[source]¶ Pops an item off the queue.
If no ZeroMQ socket has been created, one will be created the first time this method is called.
Returns: item from the queue.
Return type: object
Raises: - KeyboardInterrupt – if the process is sent a KeyboardInterrupt while popping an item.
- QueueEmpty – if the queue is empty, and no item could be popped within the queue timeout.
- RuntimeError – if terminate event is missing.
- zmq.error.ZMQError – if an error occurs in ZeroMQ.
-
PushItem
(item, block=True)[source]¶ Pushes an item on to the queue.
Provided for compatibility with the API, but doesn’t actually work.
Parameters: - item (object) – item to push on the queue.
- block (Optional[bool]) – whether the push should be performed in blocking or non-blocking mode.
Raises: WrongQueueType – as Push is not supported by this queue.
-