plaso.engine package

Submodules

plaso.engine.artifact_filters module

Helper to create filters based on forensic artifact definitions.

class plaso.engine.artifact_filters.ArtifactDefinitionsFilterHelper(artifacts_registry, artifact_filters, knowledge_base)[source]

Bases: object

Helper to create filters based on artifact definitions.

Builds extraction filters from forensic artifact definitions.

For more information about Forensic Artifacts see: https://github.com/ForensicArtifacts/artifacts/blob/master/docs/Artifacts%20definition%20format%20and%20style%20guide.asciidoc

BuildFindSpecs(environment_variables=None)[source]

Builds find specifications from artifact definitions.

The resulting find specifications are set in the knowledge base.

Parameters:environment_variables (Optional[list[EnvironmentVariableArtifact]]) – environment variables.
BuildFindSpecsFromFileArtifact(source_path, path_separator, environment_variables, user_accounts)[source]

Builds find specifications from a file source type.

Parameters:
  • source_path (str) – file system path defined by the source.
  • path_separator (str) – file system path segment separator.
  • environment_variables (list[str]) – environment variable attributes used to dynamically populate environment variables in key.
  • user_accounts (list[str]) – identified user accounts stored in the knowledge base.
Returns:

find specifications for the file source type.

Return type:

list[dfvfs.FindSpec]

BuildFindSpecsFromRegistryArtifact(source_key_path)[source]

Builds find specifications from a Windows Registry source type.

Parameters:source_key_path (str) – Windows Registry key path defined by the source.
Returns:find specifications for the Windows Registry source type.
Return type:list[dfwinreg.FindSpec]
static CheckKeyCompatibility(key_path)[source]

Checks if a Windows Registry key path is supported by dfWinReg.

Parameters:key_path (str) – path of the Windows Registry key.
Returns:True if key is compatible or False if not.
Return type:bool
KNOWLEDGE_BASE_VALUE = 'ARTIFACT_FILTERS'

plaso.engine.configurations module

Processing configuration classes.

class plaso.engine.configurations.CredentialConfiguration(credential_data=None, credential_type=None, path_spec=None)[source]

Bases: plaso.containers.interface.AttributeContainer

Configuration settings for a credential.

credential_data

bytes – credential data.

credential_type

str – credential type.

path_spec

dfvfs.PathSpec – path specification.

CONTAINER_TYPE = 'credential_configuration'
class plaso.engine.configurations.EventExtractionConfiguration[source]

Bases: plaso.containers.interface.AttributeContainer

Configuration settings for event extraction.

These settings are primarily used by the parser mediator.

filter_object

objectfilter.Filter – filter that specifies which events to include.

text_prepend

str – text to prepend to every event.

CONTAINER_TYPE = 'event_extraction_configuration'
class plaso.engine.configurations.ExtractionConfiguration[source]

Bases: plaso.containers.interface.AttributeContainer

Configuration settings for extraction.

These settings are primarily used by the extraction worker.

hasher_file_size_limit

int – maximum file size that hashers should process, where 0 or None represents unlimited.

hasher_names_string

str – comma separated string of names of hashers to use during processing.

process_archives

bool – True if archive files should be scanned for file entries.

process_compressed_streams

bool – True if file content in compressed streams should be processed.

yara_rules_string

str – Yara rule definitions.

CONTAINER_TYPE = 'extraction_configuration'
class plaso.engine.configurations.InputSourceConfiguration[source]

Bases: plaso.containers.interface.AttributeContainer

Configuration settings of an input source.

mount_path

str – path of a “mounted” directory input source.

CONTAINER_TYPE = 'input_source'
class plaso.engine.configurations.ProcessingConfiguration[source]

Bases: plaso.containers.interface.AttributeContainer

Configuration settings for processing.

artifact_filters

Optional[list[str]] – names of artifact definitions that are used for filtering file system and Windows Registry key paths.

credentials

list[CredentialConfiguration] – credential configurations.

data_location

str – path to the data files.

debug_output

bool – True if debug output should be enabled.

event_extraction

EventExtractionConfiguration – event extraction configuration.

extraction

ExtractionConfiguration – extraction configuration.

filter_file

str – path to a file with find specifications.

input_source

InputSourceConfiguration – input source configuration.

log_filename

str – name of the log file.

parser_filter_expression

str – parser filter expression, where None represents all parsers and plugins.

preferred_year

int – preferred initial year value for year-less date and time values.

profiling

ProfilingConfiguration – profiling configuration.

temporary_directory

str – path of the directory for temporary files.

CONTAINER_TYPE = 'processing_configuration'
class plaso.engine.configurations.ProfilingConfiguration[source]

Bases: plaso.containers.interface.AttributeContainer

Configuration settings for profiling.

directory

str – path to the directory where the profiling sample files should be stored.

profilers

set(str) – names of the profilers to enable. Supported profilers are:

  • ‘guppy’, which profiles memory usage using guppy;
  • ‘memory’, which profiles memory usage;
  • ‘parsers’, which profiles CPU time consumed by individual parsers;
  • ‘processing’, which profiles CPU time consumed by different parts of processing;
  • ‘serializers’, which profiles CPU time consumed by individual serializers;
  • ‘storage’, which profiles storage reads and writes.
sample_rate

int – the profiling sample rate. Contains the number of event sources processed.

CONTAINER_TYPE = 'profiling_configuration'
HaveProfileMemory()[source]

Determines if memory profiling is configured.

Returns:True if memory profiling is configured.
Return type:bool
HaveProfileMemoryGuppy()[source]

Determines if memory profiling with guppy is configured.

Returns:True if memory profiling with guppy is configured.
Return type:bool
HaveProfileParsers()[source]

Determines if parsers profiling is configured.

Returns:True if parsers profiling is configured.
Return type:bool
HaveProfileProcessing()[source]

Determines if processing profiling is configured.

Returns:True if processing profiling is configured.
Return type:bool
HaveProfileSerializers()[source]

Determines if serializers profiling is configured.

Returns:True if serializers profiling is configured.
Return type:bool
HaveProfileStorage()[source]

Determines if storage profiling is configured.

Returns:True if storage profiling is configured.
Return type:bool
HaveProfileTaskQueue()[source]

Determines if task queue profiling is configured.

Returns:True if task queue profiling is configured.
Return type:bool
HaveProfileTasks()[source]

Determines if tasks profiling is configured.

Returns:True if tasks profiling is configured.
Return type:bool
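
The relationship between the profilers set and the HaveProfile* checks can be sketched as simple set membership tests. This is a minimal illustration only: the ProfilingSettings class below is a hypothetical stand-in and is not taken from plaso, and it covers just three of the checks.

```python
class ProfilingSettings(object):
  """Hypothetical stand-in for a profiling configuration."""

  def __init__(self):
    self.directory = None
    # Names of the enabled profilers, for example 'memory' or 'storage'.
    self.profilers = set()
    self.sample_rate = None

  def HaveProfileMemory(self):
    """Returns True if the 'memory' profiler is enabled."""
    return 'memory' in self.profilers

  def HaveProfileParsers(self):
    """Returns True if the 'parsers' profiler is enabled."""
    return 'parsers' in self.profilers

  def HaveProfileStorage(self):
    """Returns True if the 'storage' profiler is enabled."""
    return 'storage' in self.profilers


settings = ProfilingSettings()
settings.profilers.update(['memory', 'storage'])
print(settings.HaveProfileMemory(), settings.HaveProfileParsers())
```

Keeping the profiler names in a set makes each check a constant-time lookup and tolerates the same name being enabled more than once.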

plaso.engine.engine module

The processing engine.

class plaso.engine.engine.BaseEngine[source]

Bases: object

Processing engine interface.

knowledge_base

KnowledgeBase – knowledge base.

classmethod BuildArtifactsRegistry(artifact_definitions_path, custom_artifacts_path)[source]

Builds an artifact definitions registry.

Parameters:
  • artifact_definitions_path (str) – path to artifact definitions file.
  • custom_artifacts_path (str) – path to custom artifact definitions file.
Returns:

artifact definitions registry.

Return type:

artifacts.ArtifactDefinitionsRegistry

Raises:

RuntimeError – if no valid FindSpecs are built.

classmethod BuildFilterFindSpecs(artifact_definitions_path, custom_artifacts_path, knowledge_base_object, artifact_filter_names=None, filter_file_path=None)[source]

Builds find specifications from artifacts or filter file if available.

Parameters:
  • artifact_definitions_path (str) – path to artifact definitions file.
  • custom_artifacts_path (str) – path to custom artifact definitions file.
  • knowledge_base_object (KnowledgeBase) – knowledge base.
  • artifact_filter_names (Optional[list[str]]) – names of artifact definitions that are used for filtering file system and Windows Registry key paths.
  • filter_file_path (Optional[str]) – path of the filter file.
Returns:

find specifications for the file source type.

Return type:

list[dfvfs.FindSpec]

Raises:

RuntimeError – if no valid FindSpecs are built.

classmethod CreateSession(artifact_filter_names=None, command_line_arguments=None, debug_mode=False, filter_file_path=None, preferred_encoding='utf-8', preferred_time_zone=None, preferred_year=None)[source]

Creates a session attribute container.

Parameters:
  • artifact_filter_names (Optional[list[str]]) – names of artifact definitions that are used for filtering file system and Windows Registry key paths.
  • command_line_arguments (Optional[str]) – the command line arguments.
  • debug_mode (bool) – True if debug mode was enabled.
  • filter_file_path (Optional[str]) – path to a file with find specifications.
  • preferred_encoding (Optional[str]) – preferred encoding.
  • preferred_time_zone (Optional[str]) – preferred time zone.
  • preferred_year (Optional[int]) – preferred year.
Returns:

session attribute container.

Return type:

Session

GetSourceFileSystem(source_path_spec, resolver_context=None)[source]

Retrieves the file system of the source.

Parameters:
  • source_path_spec (dfvfs.PathSpec) – path specification of the source to process.
  • resolver_context (Optional[dfvfs.Context]) – resolver context.
Returns:

containing:

  • dfvfs.FileSystem: file system.
  • path.PathSpec: mount point path specification. The mount point path specification refers to either a directory or a volume on a storage media device or image. It is needed by the dfVFS file system searcher (FileSystemSearcher) to indicate the base location of the file system.

Return type:

tuple

Raises:

RuntimeError – if source file system path specification is not set.

PreprocessSources(artifacts_registry_object, source_path_specs, resolver_context=None)[source]

Preprocesses the sources.

Parameters:
  • artifacts_registry_object (artifacts.ArtifactDefinitionsRegistry) – artifact definitions registry.
  • source_path_specs (list[dfvfs.PathSpec]) – path specifications of the sources to process.
  • resolver_context (Optional[dfvfs.Context]) – resolver context.
classmethod SupportsGuppyMemoryProfiling()[source]

Determines if memory profiling with guppy is supported.

Returns:True if memory profiling with guppy is supported.
Return type:bool

plaso.engine.extractors module

The extractor class definitions.

An extractor is a class used to extract information from “raw” data.

class plaso.engine.extractors.EventExtractor(parser_filter_expression=None)[source]

Bases: object

Event extractor.

An event extractor extracts events from event sources.

ParseDataStream(parser_mediator, file_entry, data_stream_name)[source]

Parses a data stream of a file entry with the enabled parsers.

Parameters:
  • parser_mediator (ParserMediator) – parser mediator.
  • file_entry (dfvfs.FileEntry) – file entry.
  • data_stream_name (str) – data stream name.
Raises:

RuntimeError – if the file-like object or the parser object is missing.

ParseFileEntryMetadata(parser_mediator, file_entry)[source]

Parses the file entry metadata e.g. file system data.

Parameters:
  • parser_mediator (ParserMediator) – parser mediator.
  • file_entry (dfvfs.FileEntry) – file entry.
ParseMetadataFile(parser_mediator, file_entry, data_stream_name)[source]

Parses a metadata file.

Parameters:
  • parser_mediator (ParserMediator) – parser mediator.
  • file_entry (dfvfs.FileEntry) – file entry.
  • data_stream_name (str) – data stream name.
class plaso.engine.extractors.PathSpecExtractor(duplicate_file_check=False)[source]

Bases: object

Path specification extractor.

A path specification extractor extracts path specifications from a source directory, file or storage media device or image.

ExtractPathSpecs(path_specs, find_specs=None, recurse_file_system=True, resolver_context=None)[source]

Extracts path specifications from a specific source.

Parameters:
  • path_specs (Optional[list[dfvfs.PathSpec]]) – path specifications.
  • find_specs (Optional[list[dfvfs.FindSpec]]) – find specifications.
  • recurse_file_system (Optional[bool]) – True if extraction should recurse into a file system.
  • resolver_context (Optional[dfvfs.Context]) – resolver context.
Yields:

dfvfs.PathSpec – path specification of a file entry found in the source.

plaso.engine.filter_file module

Filter file.

class plaso.engine.filter_file.FilterFile(path)[source]

Bases: object

Filter file.

A filter file contains one or more path filters.

A path filter may contain path expansion attributes. Such an attribute is defined as anything within curly brackets, for example “System{my_attribute}PathKeyname”. If the attribute “my_attribute” is defined, the placeholder in the path filter is replaced with the attribute’s runtime value, for example “SystemMyValuePathKeyname”.

If the path filter needs to have curly brackets in the path then these need to be escaped with another curly bracket, for example “System{my_attribute}{{123-AF25-E523}}KeyName”, where “{{123-AF25-E523}}” will be replaced with “{123-AF25-E523}” at runtime.
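
The expansion and escaping rules described above match the behavior of Python's built-in str.format, so they can be illustrated with a short sketch. The ExpandPathFilter helper is a hypothetical name used for illustration; whether FilterFile performs the expansion this way internally is an assumption.

```python
def ExpandPathFilter(path_filter, path_attributes):
  """Expands path expansion attributes in a path filter (illustrative).

  Args:
    path_filter (str): path filter with {attribute} placeholders.
    path_attributes (dict[str, str]): runtime values per attribute name.

  Returns:
    str: path filter with placeholders replaced and {{ }} unescaped.

  Raises:
    KeyError: if the path filter uses an attribute that is not defined.
  """
  return path_filter.format(**path_attributes)


expanded = ExpandPathFilter(
    'System{my_attribute}{{123-AF25-E523}}KeyName',
    {'my_attribute': 'MyValue'})
print(expanded)  # prints SystemMyValue{123-AF25-E523}KeyName
```

In this sketch an undefined attribute raises KeyError, so a caller would need to decide whether to skip such a path filter or report it.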

BuildFindSpecs(environment_variables=None)[source]

Builds find specifications from a filter file.

Parameters:environment_variables (Optional[list[EnvironmentVariableArtifact]]) – environment variables.
Returns:find specifications.
Return type:list[dfvfs.FindSpec]

plaso.engine.knowledge_base module

The artifact knowledge base object.

The knowledge base is filled by user-provided input and the pre-processing phase. It is intended to provide successive phases, such as the parsing and analysis phases, with essential information such as the time zone and codepage of the source data.

class plaso.engine.knowledge_base.KnowledgeBase[source]

Bases: object

Class that implements the artifact knowledge base.

AddEnvironmentVariable(environment_variable)[source]

Adds an environment variable.

Parameters:environment_variable (EnvironmentVariableArtifact) – environment variable artifact.
Raises:KeyError – if the environment variable already exists.
AddUserAccount(user_account, session_identifier=0)[source]

Adds a user account.

Parameters:
  • user_account (UserAccountArtifact) – user account artifact.
  • session_identifier (Optional[str]) – session identifier, where CURRENT_SESSION represents the active session.
Raises:

KeyError – if the user account already exists.

CURRENT_SESSION = 0
GetEnvironmentVariable(name)[source]

Retrieves an environment variable.

Parameters:name (str) – name of the environment variable.
Returns:environment variable artifact or None if there was no value set for the given name.
Return type:EnvironmentVariableArtifact
GetEnvironmentVariables()[source]

Retrieves the environment variables.

Returns:environment variable artifacts.
Return type:list[EnvironmentVariableArtifact]
GetHostname(session_identifier=0)[source]

Retrieves the hostname related to the event.

If the hostname is not stored in the event it is determined based on the preprocessing information that is stored inside the storage file.

Parameters:session_identifier (Optional[str]) – session identifier, where CURRENT_SESSION represents the active session.
Returns:hostname.
Return type:str
GetStoredHostname()[source]

Retrieves the stored hostname.

The hostname is determined based on the preprocessing information that is stored inside the storage file.

Returns:hostname.
Return type:str
GetSystemConfigurationArtifact(session_identifier=0)[source]

Retrieves the knowledge base as a system configuration artifact.

Parameters:session_identifier (Optional[str]) – session identifier, where CURRENT_SESSION represents the active session.
Returns:system configuration artifact.
Return type:SystemConfigurationArtifact
GetUsernameByIdentifier(user_identifier, session_identifier=0)[source]

Retrieves the username based on a user identifier.

Parameters:
  • user_identifier (str) – user identifier, either a UID or SID.
  • session_identifier (Optional[str]) – session identifier, where CURRENT_SESSION represents the active session.
Returns:

username.

Return type:

str

GetUsernameForPath(path)[source]

Retrieves a username for a specific path.

Determines whether the path is within a user’s directory and, if so, returns the username of that user.

Parameters:path (str) – path.
Returns:username or None if the path does not appear to be within a user’s directory.
Return type:str
GetValue(identifier, default_value=None)[source]

Retrieves a value by identifier.

Parameters:
  • identifier (str) – case insensitive unique identifier for the value.
  • default_value (object) – default value.
Returns:

value or default value if not available.

Return type:

object

Raises:

TypeError – if the identifier is not a string type.

HasUserAccounts()[source]

Determines if the knowledge base contains user accounts.

Returns:True if the knowledge base contains user accounts.
Return type:bool
ReadSystemConfigurationArtifact(system_configuration, session_identifier=0)[source]

Reads the knowledge base values from a system configuration artifact.

Note that this overwrites existing values in the knowledge base.

Parameters:
  • system_configuration (SystemConfigurationArtifact) – system configuration artifact.
  • session_identifier (Optional[str]) – session identifier, where CURRENT_SESSION represents the active session.
SetCodepage(codepage)[source]

Sets the codepage.

Parameters:codepage (str) – codepage.
Raises:ValueError – if the codepage is not supported.
SetEnvironmentVariable(environment_variable)[source]

Sets an environment variable.

Parameters:environment_variable (EnvironmentVariableArtifact) – environment variable artifact.
SetHostname(hostname, session_identifier=0)[source]

Sets a hostname.

Parameters:
  • hostname (HostnameArtifact) – hostname artifact.
  • session_identifier (Optional[str]) – session identifier, where CURRENT_SESSION represents the active session.
SetTimeZone(time_zone)[source]

Sets the time zone.

Parameters:time_zone (str) – time zone.
Raises:ValueError – if the timezone is not supported.
SetValue(identifier, value)[source]

Sets a value by identifier.

Parameters:
  • identifier (str) – case insensitive unique identifier for the value.
  • value (object) – value.
Raises:

TypeError – if the identifier is not a string type.
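
The GetValue and SetValue contract (case insensitive identifiers, a default value for missing entries, TypeError for non-string identifiers) can be sketched with a small dictionary-backed store. The ValueStore class is a hypothetical illustration, and normalizing identifiers to lower case is an assumption about how case insensitivity might be achieved.

```python
class ValueStore(object):
  """Hypothetical case-insensitive value store (illustrative)."""

  def __init__(self):
    self._values = {}

  def GetValue(self, identifier, default_value=None):
    """Retrieves a value, or default_value if the identifier is not set."""
    if not isinstance(identifier, str):
      raise TypeError('Identifier not a string type.')
    # Normalize so that 'Hostname' and 'hostname' refer to the same value.
    return self._values.get(identifier.lower(), default_value)

  def SetValue(self, identifier, value):
    """Sets a value, overwriting any earlier value for the identifier."""
    if not isinstance(identifier, str):
      raise TypeError('Identifier not a string type.')
    self._values[identifier.lower()] = value


store = ValueStore()
store.SetValue('OperatingSystem', 'Windows')
print(store.GetValue('operatingsystem'))  # prints Windows
```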

codepage

str – codepage of the current session.

hostname

str – hostname of the current session.

timezone

datetime.tzinfo – timezone of the current session.

user_accounts

list[UserAccountArtifact] – user accounts of the current session.

year

int – year of the current session.

plaso.engine.logger module

The engine sub module logger.

plaso.engine.path_helper module

The path helper.

class plaso.engine.path_helper.PathHelper[source]

Bases: object

Class that implements the path helper.

classmethod AppendPathEntries(path, path_separator, count, skip_first)[source]

Appends wildcard entries to end of path.

Appends the wildcard “*” to the given path for “count” iterations, building a list of paths, and skips the first directory if skip_first is True.

Parameters:
  • path (str) – path to append wildcards to.
  • path_separator (str) – path segment separator.
  • count (int) – number of entries to be appended.
  • skip_first (bool) – whether or not to skip the first entry to append.
Returns:

paths that were expanded from the path with wildcards.

Return type:

list[str]

classmethod ExpandRecursiveGlobs(path, path_separator)[source]

Expands recursive-like globs present in an artifact path.

If a path ends in ‘**’, with up to two optional digits such as ‘**10’, the ‘**’ will recursively match all files and zero or more directories from the specified path. The optional digits indicate the recursion depth. By default recursion depth is 10 directories.

If the glob is followed by the specified path segment separator, only directories and subdirectories will be matched.

Parameters:
  • path (str) – path to be expanded.
  • path_separator (str) – path segment separator.
Returns:

String path expanded for each glob.

Return type:

list[str]
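
One way to picture the expansion of a trailing ‘**’ glob is to generate one wildcard path per recursion level. The sketch below is an assumption-laden illustration, not plaso's implementation: the function name is hypothetical, it only handles a glob at the very end of the path, and it does not cover the case where the glob is followed by a path segment separator.

```python
import re

# Matches a trailing '**' optionally followed by one or two digits.
_RECURSIVE_GLOB_RE = re.compile(r'\*\*(\d{1,2})?$')


def ExpandTrailingRecursiveGlob(path, path_separator, default_depth=10):
  """Expands a trailing recursive glob into per-level wildcard paths.

  Args:
    path (str): path that may end in '**' or '**' plus a depth.
    path_separator (str): path segment separator.
    default_depth (int): recursion depth used when no digits are given.

  Returns:
    list[str]: expanded paths, or the original path if there is no glob.
  """
  match = _RECURSIVE_GLOB_RE.search(path)
  if not match:
    return [path]

  depth = int(match.group(1)) if match.group(1) else default_depth

  # Strip the glob and a trailing separator to get the base directory.
  base_path = path[:match.start()]
  if base_path.endswith(path_separator):
    base_path = base_path[:-len(path_separator)]

  wildcard_segment = path_separator + '*'
  return [
      base_path + wildcard_segment * level for level in range(1, depth + 1)]


print(ExpandTrailingRecursiveGlob('/etc/**2', '/'))
# prints ['/etc/*', '/etc/*/*']
```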

classmethod ExpandUsersHomeDirectoryPath(path, user_accounts)[source]

Expands a path to contain all users home or profile directories.

Expands the GRR artifacts path variable “%%users.homedir%%”.

Parameters:
  • path (str) – Windows path with environment variables.
  • user_accounts (list[UserAccountArtifact]) – user accounts.
Returns:

paths returned for user accounts without a drive letter.

Return type:

list[str]
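
The expansion of “%%users.homedir%%” can be sketched as substituting each known home directory for the variable and dropping a leading drive letter. The function below is a hypothetical illustration that takes plain strings rather than UserAccountArtifact objects; the exact matching and normalization rules used by plaso are not stated here and are assumed.

```python
_USERS_HOMEDIR = '%%users.homedir%%'


def ExpandHomeDirectoryVariable(path, home_directories):
  """Expands a leading %%users.homedir%% for every home directory.

  Args:
    path (str): path that may start with %%users.homedir%%.
    home_directories (list[str]): home or profile directory per user.

  Returns:
    list[str]: one path per home directory, without a drive letter, or
        the original path if it does not start with the variable.
  """
  if path[:len(_USERS_HOMEDIR)].lower() != _USERS_HOMEDIR:
    return [path]

  remainder = path[len(_USERS_HOMEDIR):]
  expanded_paths = []
  for home_directory in home_directories:
    # Drop a Windows drive letter such as 'C:' (assumed normalization).
    if len(home_directory) >= 2 and home_directory[1] == ':':
      home_directory = home_directory[2:]
    expanded_paths.append(home_directory + remainder)
  return expanded_paths


print(ExpandHomeDirectoryVariable(
    '%%users.homedir%%\\NTUSER.DAT', ['C:\\Users\\alice']))
```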

classmethod ExpandWindowsPath(path, environment_variables)[source]

Expands a Windows path containing environment variables.

Parameters:
  • path (str) – Windows path with environment variables.
  • environment_variables (list[EnvironmentVariableArtifact]) – environment variables.
Returns:

expanded Windows path.

Return type:

str
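
Expanding environment variables in a Windows path can be sketched segment by segment. This illustration assumes variables are written as “%NAME%”, occupy a whole path segment, and are looked up case insensitively; it uses a plain dictionary instead of EnvironmentVariableArtifact objects, and the function name is hypothetical.

```python
def ExpandWindowsPathSegments(path, environment_variables):
  """Expands %NAME% path segments using the given variables.

  Args:
    path (str): Windows path with environment variables.
    environment_variables (dict[str, str]): value per variable name.

  Returns:
    str: expanded Windows path; unknown variables are left as-is.
  """
  # Windows environment variable names are case insensitive.
  lookup = {
      name.upper(): value for name, value in environment_variables.items()}

  path_segments = path.split('\\')
  for index, path_segment in enumerate(path_segments):
    if (len(path_segment) > 2 and path_segment.startswith('%') and
        path_segment.endswith('%')):
      value = lookup.get(path_segment[1:-1].upper())
      if value is not None:
        path_segments[index] = value

  return '\\'.join(path_segments)


print(ExpandWindowsPathSegments(
    '%SystemRoot%\\System32', {'SystemRoot': 'C:\\Windows'}))
# prints C:\Windows\System32
```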

classmethod GetDisplayNameForPathSpec(path_spec, mount_path=None, text_prepend=None)[source]

Retrieves the display name of a path specification.

Parameters:
  • path_spec (dfvfs.PathSpec) – path specification.
  • mount_path (Optional[str]) – path where the file system that is used by the path specification is mounted, such as “/mnt/image”. The mount path will be stripped from the absolute path defined by the path specification.
  • text_prepend (Optional[str]) – text to prepend.
Returns:

human readable version of the path specification or None.

Return type:

str

classmethod GetRelativePathForPathSpec(path_spec, mount_path=None)[source]

Retrieves the relative path of a path specification.

If a mount path is defined the path will be relative to the mount point, otherwise the path is relative to the root of the file system that is used by the path specification.

Parameters:
  • path_spec (dfvfs.PathSpec) – path specification.
  • mount_path (Optional[str]) – path where the file system that is used by the path specification is mounted, such as “/mnt/image”. The mount path will be stripped from the absolute path defined by the path specification.
Returns:

relative path or None.

Return type:

str
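
The effect of mount_path on the resulting path can be illustrated on a plain location string. The helper below is a hypothetical simplification: it operates on a string rather than a dfvfs.PathSpec and only demonstrates the stripping of the mount path described above.

```python
def StripMountPath(location, mount_path=None):
  """Strips the mount path from an absolute location (illustrative).

  Args:
    location (str): absolute path, such as "/mnt/image/etc/passwd".
    mount_path (Optional[str]): path where the file system is mounted.

  Returns:
    str: path relative to the mount point, or the unchanged location if
        no mount path is set or the location is not below it.
  """
  if mount_path and location.startswith(mount_path):
    return location[len(mount_path):]
  return location


print(StripMountPath('/mnt/image/etc/passwd', mount_path='/mnt/image'))
# prints /etc/passwd
```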

plaso.engine.plaso_queue module

Queue management implementation for Plaso.

This file contains an implementation of a queue used by plaso for queue management.

The queue has been abstracted in order to provide support for different implementations of the queueing mechanism, to support multi processing and scalability.

class plaso.engine.plaso_queue.Queue[source]

Bases: object

Class that implements the queue interface.

Close(abort=False)[source]

Closes the queue.

Parameters:abort (Optional[bool]) – whether the Close is the result of an abort condition. If True, queue contents may be lost.
IsEmpty()[source]

Determines if the queue is empty.

Open()[source]

Opens the queue, ready to enqueue or dequeue items.

PopItem()[source]

Pops an item off the queue.

Raises:QueueEmpty – when the queue is empty.
PushItem(item, block=True)[source]

Pushes an item onto the queue.

Parameters:
  • item (object) – item to add.
  • block (bool) – whether to block if the queue is full.
Raises:

QueueFull – if the queue is full, and the item could not be added.
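
To show how the queue interface might be satisfied, the following sketch wraps Python's standard queue module. It is a hypothetical single-process implementation: InMemoryQueue is not a plaso class, and the QueueEmpty and QueueFull exceptions are defined locally as stand-ins for the errors named above.

```python
import queue


class QueueEmpty(Exception):
  """Stand-in for the error raised when the queue is empty."""


class QueueFull(Exception):
  """Stand-in for the error raised when the queue is full."""


class InMemoryQueue(object):
  """Hypothetical in-memory implementation of the queue interface."""

  def __init__(self, maximum_number_of_items=0):
    # A maximum of 0 means the queue size is unbounded.
    self._maximum_number_of_items = maximum_number_of_items
    self._queue = None

  def Open(self):
    """Opens the queue, ready to enqueue or dequeue items."""
    self._queue = queue.Queue(maxsize=self._maximum_number_of_items)

  def Close(self, abort=False):
    """Closes the queue; this sketch discards contents in either case."""
    self._queue = None

  def IsEmpty(self):
    """Determines if the queue is empty."""
    return self._queue.empty()

  def PushItem(self, item, block=True):
    """Pushes an item onto the queue."""
    try:
      self._queue.put(item, block=block)
    except queue.Full as exception:
      raise QueueFull(str(exception))

  def PopItem(self):
    """Pops an item off the queue without waiting."""
    try:
      return self._queue.get(block=False)
    except queue.Empty as exception:
      raise QueueEmpty(str(exception))


item_queue = InMemoryQueue()
item_queue.Open()
item_queue.PushItem('first')
print(item_queue.PopItem())  # prints first
item_queue.Close()
```

Translating the standard library exceptions into queue-specific ones keeps callers independent of the underlying mechanism, which is the point of abstracting the queue.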

class plaso.engine.plaso_queue.QueueAbort[source]

Bases: object

Class that implements a queue abort.

plaso.engine.process_info module

Information about a running process.

class plaso.engine.process_info.ProcessInfo(pid)[source]

Bases: object

Provides information about a running process.

GetUsedMemory()[source]

Retrieves the amount of memory used by the process.

Returns:amount of memory in bytes used by the process or None if not available.
Return type:int

plaso.engine.processing_status module

Processing status classes.

class plaso.engine.processing_status.ProcessStatus[source]

Bases: object

The status of an individual process.

display_name

str – human readable name of the file entry currently being processed by the process.

identifier

str – process identifier.

last_running_time

int – timestamp of the last update when the process had a running process status.

number_of_consumed_errors

int – total number of errors consumed by the process.

number_of_consumed_errors_delta

int – number of errors consumed by the process since the last status update.

number_of_consumed_event_tags

int – total number of event tags consumed by the process.

number_of_consumed_event_tags_delta

int – number of event tags consumed by the process since the last status update.

number_of_consumed_events

int – total number of events consumed by the process.

number_of_consumed_events_delta

int – number of events consumed by the process since the last status update.

number_of_consumed_reports

int – total number of event reports consumed by the process.

number_of_consumed_reports_delta

int – number of event reports consumed by the process since the last status update.

number_of_consumed_sources

int – total number of event sources consumed by the process.

number_of_consumed_sources_delta

int – number of event sources consumed by the process since the last status update.

number_of_produced_errors

int – total number of errors produced by the process.

number_of_produced_errors_delta

int – number of errors produced by the process since the last status update.

number_of_produced_event_tags

int – total number of event tags produced by the process.

number_of_produced_event_tags_delta

int – number of event tags produced by the process since the last status update.

number_of_produced_events

int – total number of events produced by the process.

number_of_produced_events_delta

int – number of events produced by the process since the last status update.

number_of_produced_reports

int – total number of event reports produced by the process.

number_of_produced_reports_delta

int – number of event reports produced by the process since the last status update.

number_of_produced_sources

int – total number of event sources produced by the process.

number_of_produced_sources_delta

int – number of event sources produced by the process since the last status update.

pid

int – process identifier (PID).

status

str – human readable status indication e.g. ‘Hashing’, ‘Idle’.

used_memory

int – size of used memory in bytes.

UpdateNumberOfErrors(number_of_consumed_errors, number_of_produced_errors)[source]

Updates the number of errors.

Parameters:
  • number_of_consumed_errors (int) – total number of errors consumed by the process.
  • number_of_produced_errors (int) – total number of errors produced by the process.
Returns:

True if either number of errors has increased.

Return type:

bool

Raises:

ValueError – if the consumed or produced number of errors is smaller than the value of the previous update.

UpdateNumberOfEventReports(number_of_consumed_reports, number_of_produced_reports)[source]

Updates the number of event reports.

Parameters:
  • number_of_consumed_reports (int) – total number of event reports consumed by the process.
  • number_of_produced_reports (int) – total number of event reports produced by the process.
Returns:

True if either number of event reports has increased.

Return type:

bool

Raises:

ValueError – if the consumed or produced number of event reports is smaller than the value of the previous update.

UpdateNumberOfEventSources(number_of_consumed_sources, number_of_produced_sources)[source]

Updates the number of event sources.

Parameters:
  • number_of_consumed_sources (int) – total number of event sources consumed by the process.
  • number_of_produced_sources (int) – total number of event sources produced by the process.
Returns:

True if either number of event sources has increased.

Return type:

bool

Raises:

ValueError – if the consumed or produced number of event sources is smaller than the value of the previous update.

UpdateNumberOfEventTags(number_of_consumed_event_tags, number_of_produced_event_tags)[source]

Updates the number of event tags.

Parameters:
  • number_of_consumed_event_tags (int) – total number of event tags consumed by the process.
  • number_of_produced_event_tags (int) – total number of event tags produced by the process.
Returns:

True if either number of event tags has increased.

Return type:

bool

Raises:

ValueError – if the consumed or produced number of event tags is smaller than the value of the previous update.

UpdateNumberOfEvents(number_of_consumed_events, number_of_produced_events)[source]

Updates the number of events.

Parameters:
  • number_of_consumed_events (int) – total number of events consumed by the process.
  • number_of_produced_events (int) – total number of events produced by the process.
Returns:

True if either number of events has increased.

Return type:

bool

Raises:

ValueError – if the consumed or produced number of events is smaller than the value of the previous update.
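
The UpdateNumberOf* methods share one contract: totals may only grow, the delta records the change since the previous update, and the return value signals whether anything increased. A minimal sketch for the event counters, using a hypothetical EventCounters class rather than plaso's ProcessStatus, could look like this:

```python
class EventCounters(object):
  """Hypothetical holder of consumed and produced event counters."""

  def __init__(self):
    self.number_of_consumed_events = 0
    self.number_of_consumed_events_delta = 0
    self.number_of_produced_events = 0
    self.number_of_produced_events_delta = 0

  def UpdateNumberOfEvents(
      self, number_of_consumed_events, number_of_produced_events):
    """Updates the totals and deltas.

    Returns:
      bool: True if either number of events has increased.

    Raises:
      ValueError: if a new total is smaller than the previous total.
    """
    if number_of_consumed_events < self.number_of_consumed_events:
      raise ValueError('Consumed number of events smaller than before.')
    if number_of_produced_events < self.number_of_produced_events:
      raise ValueError('Produced number of events smaller than before.')

    consumed_delta = number_of_consumed_events - self.number_of_consumed_events
    produced_delta = number_of_produced_events - self.number_of_produced_events

    self.number_of_consumed_events = number_of_consumed_events
    self.number_of_consumed_events_delta = consumed_delta
    self.number_of_produced_events = number_of_produced_events
    self.number_of_produced_events_delta = produced_delta

    return consumed_delta > 0 or produced_delta > 0


counters = EventCounters()
print(counters.UpdateNumberOfEvents(5, 3))  # prints True
print(counters.UpdateNumberOfEvents(5, 3))  # prints False
```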

class plaso.engine.processing_status.ProcessingStatus[source]

Bases: object

The status of the overall extraction process (processing).

aborted

bool – True if processing was aborted.

error_path_specs

list[dfvfs.PathSpec] – path specifications that caused critical errors during processing.

foreman_status

ProcessingStatus – foreman processing status.

start_time

float – time that the processing was started. Contains the number of microseconds since January 1, 1970, 00:00:00 UTC.

tasks_status

TasksStatus – status information about tasks.

UpdateForemanStatus(identifier, status, pid, used_memory, display_name, number_of_consumed_sources, number_of_produced_sources, number_of_consumed_events, number_of_produced_events, number_of_consumed_event_tags, number_of_produced_event_tags, number_of_consumed_errors, number_of_produced_errors, number_of_consumed_reports, number_of_produced_reports)[source]

Updates the status of the foreman.

Parameters:
  • identifier (str) – foreman identifier.
  • status (str) – human readable status of the foreman e.g. ‘Idle’.
  • pid (int) – process identifier (PID).
  • used_memory (int) – size of used memory in bytes.
  • display_name (str) – human readable name of the file entry currently being processed by the foreman.
  • number_of_consumed_sources (int) – total number of event sources consumed by the foreman.
  • number_of_produced_sources (int) – total number of event sources produced by the foreman.
  • number_of_consumed_events (int) – total number of events consumed by the foreman.
  • number_of_produced_events (int) – total number of events produced by the foreman.
  • number_of_consumed_event_tags (int) – total number of event tags consumed by the foreman.
  • number_of_produced_event_tags (int) – total number of event tags produced by the foreman.
  • number_of_consumed_errors (int) – total number of errors consumed by the foreman.
  • number_of_produced_errors (int) – total number of errors produced by the foreman.
  • number_of_consumed_reports (int) – total number of event reports consumed by the foreman.
  • number_of_produced_reports (int) – total number of event reports produced by the foreman.
UpdateTasksStatus(tasks_status)[source]

Updates the tasks status.

Parameters:tasks_status (TasksStatus) – status information about tasks.
UpdateWorkerStatus(identifier, status, pid, used_memory, display_name, number_of_consumed_sources, number_of_produced_sources, number_of_consumed_events, number_of_produced_events, number_of_consumed_event_tags, number_of_produced_event_tags, number_of_consumed_errors, number_of_produced_errors, number_of_consumed_reports, number_of_produced_reports)[source]

Updates the status of a worker.

Parameters:
  • identifier (str) – worker identifier.
  • status (str) – human readable status of the worker e.g. ‘Idle’.
  • pid (int) – process identifier (PID).
  • used_memory (int) – size of used memory in bytes.
  • display_name (str) – human readable name of the file entry currently being processed by the worker.
  • number_of_consumed_sources (int) – total number of event sources consumed by the worker.
  • number_of_produced_sources (int) – total number of event sources produced by the worker.
  • number_of_consumed_events (int) – total number of events consumed by the worker.
  • number_of_produced_events (int) – total number of events produced by the worker.
  • number_of_consumed_event_tags (int) – total number of event tags consumed by the worker.
  • number_of_produced_event_tags (int) – total number of event tags produced by the worker.
  • number_of_consumed_errors (int) – total number of errors consumed by the worker.
  • number_of_produced_errors (int) – total number of errors produced by the worker.
  • number_of_consumed_reports (int) – total number of event reports consumed by the worker.
  • number_of_produced_reports (int) – total number of event reports produced by the worker.
workers_status

The worker status objects sorted by identifier.
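
The relationship between UpdateWorkerStatus and workers_status can be illustrated with a minimal sketch. The classes below (ProcessStatusSketch, ProcessingStatusSketch) are hypothetical stand-ins written for this example, not Plaso's own classes, and only a few of the documented counters are included.

```python
class ProcessStatusSketch(object):
  """Hypothetical stand-in for the status record of a single process."""

  def __init__(self, identifier):
    self.identifier = identifier
    self.status = None
    self.pid = None
    self.used_memory = 0
    self.number_of_produced_events = 0


class ProcessingStatusSketch(object):
  """Hypothetical stand-in that tracks worker status by identifier."""

  def __init__(self):
    self._workers_status = {}

  @property
  def workers_status(self):
    """list[ProcessStatusSketch]: worker status objects sorted by identifier."""
    return [
        self._workers_status[identifier]
        for identifier in sorted(self._workers_status)]

  def UpdateWorkerStatus(
      self, identifier, status, pid, used_memory, number_of_produced_events):
    """Creates the status record of a worker on first use, then updates it."""
    if identifier not in self._workers_status:
      self._workers_status[identifier] = ProcessStatusSketch(identifier)

    worker_status = self._workers_status[identifier]
    worker_status.status = status
    worker_status.pid = pid
    worker_status.used_memory = used_memory
    worker_status.number_of_produced_events = number_of_produced_events
```

Repeated calls with the same identifier overwrite the earlier values, so the sorted view always reflects the most recent update per worker.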

class plaso.engine.processing_status.TasksStatus[source]

Bases: object

The status of the tasks.

number_of_abandoned_tasks

int – number of abandoned tasks.

number_of_queued_tasks

int – number of queued tasks.

number_of_tasks_pending_merge

int – number of tasks pending merge.

number_of_tasks_processing

int – number of tasks processing.

total_number_of_tasks

int – total number of tasks.

plaso.engine.profilers module

The profiler classes.

class plaso.engine.profilers.CPUTimeMeasurement[source]

Bases: object

The CPU time measurement.

start_sample_time

float – start sample time or None if not set.

total_cpu_time

float – total CPU time or None if not set.

SampleStart()[source]

Starts measuring the CPU time.

SampleStop()[source]

Stops measuring the CPU time.
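
One way a start/stop CPU time measurement could behave is sketched below. CPUTimeMeasurementSketch is a hypothetical stand-in, and the use of time.process_time() as the clock is an assumption of this example; the clock Plaso actually uses may differ.

```python
import time


class CPUTimeMeasurementSketch(object):
  """Hypothetical stand-in for a CPU time measurement."""

  def __init__(self):
    # Both attributes are None until the first sample is started.
    self.start_sample_time = None
    self.total_cpu_time = None

  def SampleStart(self):
    """Starts measuring the CPU time."""
    # Assumption: time.process_time() is used as the CPU clock.
    self.start_sample_time = time.process_time()
    self.total_cpu_time = 0.0

  def SampleStop(self):
    """Stops measuring the CPU time."""
    if self.start_sample_time is not None:
      self.total_cpu_time += time.process_time() - self.start_sample_time


measurement = CPUTimeMeasurementSketch()
measurement.SampleStart()
sum(range(100000))  # Some work to measure.
measurement.SampleStop()
```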

class plaso.engine.profilers.CPUTimeProfiler(identifier, configuration)[source]

Bases: plaso.engine.profilers.SampleFileProfiler

The CPU time profiler.

StartTiming(profile_name)[source]

Starts timing CPU time.

Parameters:profile_name (str) – name of the profile to sample.
StopTiming(profile_name)[source]

Stops timing CPU time.

Parameters:profile_name (str) – name of the profile to sample.
class plaso.engine.profilers.GuppyMemoryProfiler(identifier, configuration)[source]

Bases: object

The guppy-based memory profiler.

classmethod IsSupported()[source]

Determines if the profiler is supported.

Returns:True if the profiler is supported.
Return type:bool
Sample()[source]

Takes a sample for profiling.

Start()[source]

Starts the profiler.

Stop()[source]

Stops the profiler.

class plaso.engine.profilers.MemoryProfiler(identifier, configuration)[source]

Bases: plaso.engine.profilers.SampleFileProfiler

The memory profiler.

Sample(profile_name, used_memory)[source]

Takes a sample for profiling.

Parameters:
  • profile_name (str) – name of the profile to sample.
  • used_memory (int) – amount of used memory in bytes.
class plaso.engine.profilers.ProcessingProfiler(identifier, configuration)[source]

Bases: plaso.engine.profilers.CPUTimeProfiler

The processing profiler.

class plaso.engine.profilers.SampleFileProfiler(identifier, configuration)[source]

Bases: object

Shared functionality for sample file-based profilers.

classmethod IsSupported()[source]

Determines if the profiler is supported.

Returns:True if the profiler is supported.
Return type:bool
Start()[source]

Starts the profiler.

Stop()[source]

Stops the profiler.

class plaso.engine.profilers.SerializersProfiler(identifier, configuration)[source]

Bases: plaso.engine.profilers.CPUTimeProfiler

The serializers profiler.

class plaso.engine.profilers.StorageProfiler(identifier, configuration)[source]

Bases: plaso.engine.profilers.SampleFileProfiler

The storage profiler.

Sample(operation, description, data_size, compressed_data_size)[source]

Takes a sample of data read or written for profiling.

Parameters:
  • operation (str) – operation, either ‘read’ or ‘write’.
  • description (str) – description of the data read or written.
  • data_size (int) – size of the data read or written in bytes.
  • compressed_data_size (int) – size of the compressed data read or written in bytes.
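
To make the Start, Sample and Stop sequence of a sample file-based profiler more concrete, here is a minimal sketch. StorageProfilerSketch is a hypothetical stand-in, and the tab-separated line layout and header are assumptions of this example rather than the format Plaso writes.

```python
import io
import time


class StorageProfilerSketch(object):
  """Hypothetical stand-in for a sample file-based storage profiler."""

  def __init__(self, file_object):
    self._file_object = file_object

  def Start(self):
    """Writes a header line; the column layout is an assumption."""
    self._file_object.write(
        'Time\tOperation\tDescription\tData size\tCompressed data size\n')

  def Sample(self, operation, description, data_size, compressed_data_size):
    """Writes one line per sample of data read or written."""
    self._file_object.write('{0:f}\t{1:s}\t{2:s}\t{3:d}\t{4:d}\n'.format(
        time.time(), operation, description, data_size,
        compressed_data_size))

  def Stop(self):
    """Flushes pending samples."""
    self._file_object.flush()


sample_file = io.StringIO()  # In-memory stand-in for a sample file.
profiler = StorageProfilerSketch(sample_file)
profiler.Start()
profiler.Sample('write', 'event', 2048, 512)
profiler.Stop()
```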
class plaso.engine.profilers.TaskQueueProfiler(identifier, configuration)[source]

Bases: plaso.engine.profilers.SampleFileProfiler

The task queue profiler.

Sample(tasks_status)[source]

Takes a sample of the status of queued tasks for profiling.

Parameters:tasks_status (TasksStatus) – status information about tasks.
class plaso.engine.profilers.TasksProfiler(identifier, configuration)[source]

Bases: plaso.engine.profilers.SampleFileProfiler

The tasks profiler.

Sample(task, status)[source]

Takes a sample of the status of a task for profiling.

Parameters:
  • task (Task) – a task.
  • status (str) – status.

plaso.engine.single_process module

The single process processing engine.

class plaso.engine.single_process.SingleProcessEngine[source]

Bases: plaso.engine.engine.BaseEngine

Class that defines the single process engine.

ProcessSources(source_path_specs, storage_writer, resolver_context, processing_configuration, filter_find_specs=None, status_update_callback=None)[source]

Processes the sources.

Parameters:
  • source_path_specs (list[dfvfs.PathSpec]) – path specifications of the sources to process.
  • storage_writer (StorageWriter) – storage writer for a session storage.
  • resolver_context (dfvfs.Context) – resolver context.
  • processing_configuration (ProcessingConfiguration) – processing configuration.
  • filter_find_specs (Optional[list[dfvfs.FindSpec]]) – find specifications used in path specification extraction.
  • status_update_callback (Optional[function]) – callback function for status updates.
Returns:processing status.
Return type:ProcessingStatus

plaso.engine.tagging_file module

Tagging file.

class plaso.engine.tagging_file.TaggingFile(path)[source]

Bases: object

Tagging file.

A tagging file contains one or more event tagging rules.

GetEventTaggingRules()[source]

Retrieves the event tagging rules from the tagging file.

Returns:efilter abstract syntax tree (AST) containing the tagging rules.
Return type:efilter.ast.Expression

plaso.engine.worker module

The event extraction worker.

class plaso.engine.worker.EventExtractionWorker(parser_filter_expression=None)[source]

Bases: object

Event extraction worker.

The event extraction worker determines which parsers are suitable for parsing a particular file entry or data stream. The parsers extract relevant data from file system and/or file content data. All extracted data is passed to the parser mediator for further processing.

last_activity_timestamp

int – timestamp received that indicates the last time activity was observed.

processing_status

str – human readable status indication such as: ‘Extracting’, ‘Hashing’.

GetAnalyzerNames()[source]

Gets the names of the active analyzers.

Returns:names of active analyzers.
Return type:list[str]
ProcessPathSpec(mediator, path_spec)[source]

Processes a path specification.

Parameters:
  • mediator (ParserMediator) – mediates the interactions between parsers and other components, such as storage and abort signals.
  • path_spec (dfvfs.PathSpec) – path specification.
SetExtractionConfiguration(configuration)[source]

Sets the extraction configuration settings.

Parameters:configuration (ExtractionConfiguration) – extraction configuration.
SetProcessingProfiler(processing_profiler)[source]

Sets the processing profiler.

Parameters:processing_profiler (ProcessingProfiler) – processing profiler.
SignalAbort()[source]

Signals the extraction worker to abort.

plaso.engine.zeromq_queue module

ZeroMQ implementations of the Plaso queue interface.

class plaso.engine.zeromq_queue.ZeroMQBufferedQueue(buffer_timeout_seconds=2, buffer_max_size=10000, delay_open=True, linger_seconds=10, maximum_items=1000, name='Unnamed', port=None, timeout_seconds=5)[source]

Bases: plaso.engine.zeromq_queue.ZeroMQQueue

Parent class for buffered Plaso queues.

Buffered queues use a regular Python queue to store items that are pushed or popped from the queue without blocking on underlying ZeroMQ operations.

This class should not be instantiated directly; instantiate a subclass instead.

Close(abort=False)[source]

Closes the queue.

Parameters:abort (Optional[bool]) – whether the Close is the result of an abort condition. If True, queue contents may be lost.
Raises:
  • QueueAlreadyClosed – if the queue is not started, or has already been closed.
  • RuntimeError – if closed or terminate event is missing.
Empty()[source]

Removes all items from the internal buffer.
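
The buffering idea described for this class can be sketched roughly as follows. BufferedQueueSketch is a hypothetical stand-in: send_function takes the place of a ZeroMQ socket send, and the forwarding thread is an assumption about how a buffer might be drained, not a description of Plaso's internals.

```python
import queue
import threading


class BufferedQueueSketch(object):
  """Hypothetical buffered queue; send_function stands in for a socket send."""

  def __init__(
      self, send_function, buffer_max_size=10000, buffer_timeout_seconds=2):
    self._buffer = queue.Queue(maxsize=buffer_max_size)
    self._buffer_timeout_seconds = buffer_timeout_seconds
    self._closed_event = threading.Event()
    self._send_function = send_function
    self._thread = threading.Thread(target=self._Forward, daemon=True)
    self._thread.start()

  def _Forward(self):
    """Moves items from the buffer to the (possibly slow) transport."""
    while not (self._closed_event.is_set() and self._buffer.empty()):
      try:
        item = self._buffer.get(timeout=0.1)
      except queue.Empty:
        continue
      self._send_function(item)

  def PushItem(self, item, block=True):
    """Stores an item in the buffer without waiting on the transport.

    Raises queue.Full if the buffer stays full for the buffer timeout.
    """
    self._buffer.put(item, block=block, timeout=self._buffer_timeout_seconds)

  def Close(self):
    """Waits until buffered items are forwarded, then stops the thread."""
    self._closed_event.set()
    self._thread.join()


sent_items = []
buffered_queue = BufferedQueueSketch(sent_items.append)
for number in range(5):
  buffered_queue.PushItem(number)
buffered_queue.Close()
```

The caller only ever touches the in-memory buffer, so a slow or stalled transport delays the forwarding thread rather than the code that pushes items.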

class plaso.engine.zeromq_queue.ZeroMQBufferedReplyBindQueue(buffer_timeout_seconds=2, buffer_max_size=10000, delay_open=True, linger_seconds=10, maximum_items=1000, name='Unnamed', port=None, timeout_seconds=5)[source]

Bases: plaso.engine.zeromq_queue.ZeroMQBufferedReplyQueue

A Plaso queue backed by a ZeroMQ REP socket that binds to a port.

This queue may only be used to push items, not to pop.

SOCKET_CONNECTION_TYPE = 1
class plaso.engine.zeromq_queue.ZeroMQBufferedReplyQueue(buffer_timeout_seconds=2, buffer_max_size=10000, delay_open=True, linger_seconds=10, maximum_items=1000, name='Unnamed', port=None, timeout_seconds=5)[source]

Bases: plaso.engine.zeromq_queue.ZeroMQBufferedQueue

Parent class for buffered Plaso queues backed by ZeroMQ REP sockets.

This class should not be instantiated directly; instantiate a subclass instead.

Instances of this class or subclasses may only be used to push items, not to pop.

PopItem()[source]

Pops an item off the queue.

Provided for compatibility with the API, but doesn’t actually work.

Raises:WrongQueueType – as Pop is not supported by this queue.
PushItem(item, block=True)[source]

Pushes an item on to the queue.

If no ZeroMQ socket has been created, one will be created the first time this method is called.

Parameters:
  • item (object) – item to push on the queue.
  • block (Optional[bool]) – whether the push should be performed in blocking or non-blocking mode.
Raises:
  • QueueAlreadyClosed – if the queue is closed.
  • QueueFull – if the internal buffer was full and it was not possible to push the item to the buffer within the timeout.
  • RuntimeError – if closed event is missing.
class plaso.engine.zeromq_queue.ZeroMQPullConnectQueue(delay_open=True, linger_seconds=10, maximum_items=1000, name='Unnamed', port=None, timeout_seconds=5)[source]

Bases: plaso.engine.zeromq_queue.ZeroMQPullQueue

A Plaso queue backed by a ZeroMQ PULL socket that connects to a port.

This queue may only be used to pop items, not to push.

SOCKET_CONNECTION_TYPE = 2
class plaso.engine.zeromq_queue.ZeroMQPullQueue(delay_open=True, linger_seconds=10, maximum_items=1000, name='Unnamed', port=None, timeout_seconds=5)[source]

Bases: plaso.engine.zeromq_queue.ZeroMQQueue

Parent class for Plaso queues backed by ZeroMQ PULL sockets.

This class should not be instantiated directly; instantiate a subclass instead.

Instances of this class or subclasses may only be used to pop items, not to push.

PopItem()[source]

Pops an item off the queue.

If no ZeroMQ socket has been created, one will be created the first time this method is called.

Returns:item from the queue.
Return type:object
Raises:
  • KeyboardInterrupt – if the process is sent a KeyboardInterrupt while popping an item.
  • QueueEmpty – if the queue is empty, and no item could be popped within the queue timeout.
  • RuntimeError – if closed or terminate event is missing.
  • zmq.error.ZMQError – if a ZeroMQ error occurs.
PushItem(item, block=True)[source]

Pushes an item on to the queue.

Provided for compatibility with the API, but doesn’t actually work.

Parameters:
  • item (object) – item to push on the queue.
  • block (Optional[bool]) – whether the push should be performed in blocking or non-blocking mode.
Raises:WrongQueueType – as Push is not supported by this queue.

class plaso.engine.zeromq_queue.ZeroMQPushBindQueue(delay_open=True, linger_seconds=10, maximum_items=1000, name='Unnamed', port=None, timeout_seconds=5)[source]

Bases: plaso.engine.zeromq_queue.ZeroMQPushQueue

A Plaso queue backed by a ZeroMQ PUSH socket that binds to a port.

This queue may only be used to push items, not to pop.

SOCKET_CONNECTION_TYPE = 1
class plaso.engine.zeromq_queue.ZeroMQPushQueue(delay_open=True, linger_seconds=10, maximum_items=1000, name='Unnamed', port=None, timeout_seconds=5)[source]

Bases: plaso.engine.zeromq_queue.ZeroMQQueue

Parent class for Plaso queues backed by ZeroMQ PUSH sockets.

This class should not be instantiated directly; instantiate a subclass instead.

Instances of this class or subclasses may only be used to push items, not to pop.

PopItem()[source]

Pops an item off the queue.

Provided for compatibility with the API, but doesn’t actually work.

Raises:WrongQueueType – as Pop is not supported by this queue.
PushItem(item, block=True)[source]

Pushes an item on to the queue.

If no ZeroMQ socket has been created, one will be created the first time this method is called.

Parameters:
  • item (object) – item to push on the queue.
  • block (Optional[bool]) – whether the push should be performed in blocking or non-blocking mode.
Raises:
  • KeyboardInterrupt – if the process is sent a KeyboardInterrupt while pushing an item.
  • QueueFull – if it was not possible to push the item to the queue within the timeout.
  • RuntimeError – if terminate event is missing.
  • zmq.error.ZMQError – if a ZeroMQ specific error occurs.
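
The one-directional behaviour of the push-only and pop-only queue classes can be pictured with the sketch below. WrongQueueType and PushOnlyQueueSketch are hypothetical stand-ins defined for this example, and a plain list takes the place of a ZeroMQ socket.

```python
class WrongQueueType(Exception):
  """Hypothetical stand-in for the WrongQueueType error named above."""


class PushOnlyQueueSketch(object):
  """Hypothetical queue that accepts pushes and rejects pops."""

  def __init__(self):
    self._items = []  # Stand-in for a ZeroMQ PUSH socket.

  def PopItem(self):
    """Present for interface compatibility; always raises WrongQueueType."""
    raise WrongQueueType('pop is not supported by a push-only queue')

  def PushItem(self, item, block=True):
    """Pushes an item on to the queue."""
    self._items.append(item)


push_queue = PushOnlyQueueSketch()
push_queue.PushItem('path specification')

try:
  push_queue.PopItem()
  pop_supported = True
except WrongQueueType:
  pop_supported = False
```

Keeping both methods on every class lets callers treat all queues through one interface, at the cost of discovering a misuse only at run time.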
class plaso.engine.zeromq_queue.ZeroMQQueue(delay_open=True, linger_seconds=10, maximum_items=1000, name='Unnamed', port=None, timeout_seconds=5)[source]

Bases: plaso.engine.plaso_queue.Queue

Interface for a ZeroMQ backed queue.

name

str – name to identify the queue.

port

int – TCP port that the queue is connected or bound to. If the queue is not yet bound or connected to a port, this value will be None.

timeout_seconds

int – number of seconds that calls to PopItem and PushItem may block for before raising QueueEmpty or QueueFull, respectively.

Close(abort=False)[source]

Closes the queue.

Parameters:abort (Optional[bool]) – whether the Close is the result of an abort condition. If True, queue contents may be lost.
Raises:
  • QueueAlreadyClosed – if the queue is not started, or has already been closed.
  • RuntimeError – if closed or terminate event is missing.
IsBound()[source]

Checks if the queue is bound to a port.

IsConnected()[source]

Checks if the queue is connected to a port.

IsEmpty()[source]

Checks if the queue is empty.

ZeroMQ queues don’t have a concept of “empty” - there could always be messages on the queue that a producer or consumer is unaware of. Thus, the queue is never empty, so we return False. Note that it is possible that a queue is unable to pop an item from a queue within a timeout, which will cause PopItem to raise a QueueEmpty exception, but this is a different condition.

Returns:False, to indicate that the queue isn’t empty.
Return type:bool
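
The distinction drawn above, between IsEmpty always returning False and PopItem raising QueueEmpty after a timeout, might look like this in a simplified form. TimeoutQueueSketch and its QueueEmpty exception are hypothetical stand-ins; a standard library queue replaces the ZeroMQ socket.

```python
import queue


class QueueEmpty(Exception):
  """Hypothetical stand-in for the QueueEmpty error named above."""


class TimeoutQueueSketch(object):
  """Hypothetical queue whose emptiness cannot be known in advance."""

  def __init__(self, timeout_seconds=5):
    self._queue = queue.Queue()  # Stand-in for a ZeroMQ socket.
    self.timeout_seconds = timeout_seconds

  def IsEmpty(self):
    """Always False: a producer may have items in flight."""
    return False

  def PushItem(self, item, block=True):
    """Pushes an item on to the queue."""
    self._queue.put(item, block=block)

  def PopItem(self):
    """Pops an item, raising QueueEmpty if none arrives within the timeout."""
    try:
      return self._queue.get(timeout=self.timeout_seconds)
    except queue.Empty:
      raise QueueEmpty('no item received within the timeout')


sketch_queue = TimeoutQueueSketch(timeout_seconds=0.01)
try:
  sketch_queue.PopItem()
  timed_out = False
except QueueEmpty:
  timed_out = True
```

A consumer written against this behaviour would treat QueueEmpty as "nothing yet, try again or check for abort" rather than as proof that the producer is finished.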
Open()[source]

Opens this queue, causing the creation of a ZeroMQ socket.

Raises:QueueAlreadyStarted – if the queue is already started, and a socket already exists.
PopItem()[source]

Pops an item off the queue.

Returns:item from the queue.
Return type:object
Raises:QueueEmpty – if the queue is empty, and no item could be popped within the queue timeout.
PushItem(item, block=True)[source]

Pushes an item on to the queue.

Parameters:
  • item (object) – item to push on the queue.
  • block (Optional[bool]) – whether the push should be performed in blocking or non-blocking mode.
Raises:QueueAlreadyClosed – if the queue is closed.

SOCKET_CONNECTION_BIND = 1
SOCKET_CONNECTION_CONNECT = 2
SOCKET_CONNECTION_TYPE = None
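
The constants above suggest that each concrete subclass declares whether its socket binds to or connects to a port. A minimal sketch of that pattern follows; BaseQueueSketch, its subclasses and GetConnectionMode are hypothetical names introduced for this example, and only the three constants come from the documented class.

```python
class BaseQueueSketch(object):
  """Hypothetical base class mirroring the constants listed above."""

  SOCKET_CONNECTION_BIND = 1
  SOCKET_CONNECTION_CONNECT = 2
  SOCKET_CONNECTION_TYPE = None

  def GetConnectionMode(self):
    """Returns 'bind' or 'connect' depending on the subclass constant."""
    if self.SOCKET_CONNECTION_TYPE == self.SOCKET_CONNECTION_BIND:
      return 'bind'
    if self.SOCKET_CONNECTION_TYPE == self.SOCKET_CONNECTION_CONNECT:
      return 'connect'
    raise ValueError('a subclass must set SOCKET_CONNECTION_TYPE')


class PushBindQueueSketch(BaseQueueSketch):
  """Hypothetical subclass whose socket would bind to a port."""

  SOCKET_CONNECTION_TYPE = BaseQueueSketch.SOCKET_CONNECTION_BIND


class PullConnectQueueSketch(BaseQueueSketch):
  """Hypothetical subclass whose socket would connect to a port."""

  SOCKET_CONNECTION_TYPE = BaseQueueSketch.SOCKET_CONNECTION_CONNECT
```

Leaving the base class value as None makes an accidental direct instantiation fail loudly instead of silently choosing a connection mode.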
class plaso.engine.zeromq_queue.ZeroMQRequestConnectQueue(delay_open=True, linger_seconds=10, maximum_items=1000, name='Unnamed', port=None, timeout_seconds=5)[source]

Bases: plaso.engine.zeromq_queue.ZeroMQRequestQueue

A Plaso queue backed by a ZeroMQ REQ socket that connects to a port.

This queue may only be used to pop items, not to push.

SOCKET_CONNECTION_TYPE = 2
class plaso.engine.zeromq_queue.ZeroMQRequestQueue(delay_open=True, linger_seconds=10, maximum_items=1000, name='Unnamed', port=None, timeout_seconds=5)[source]

Bases: plaso.engine.zeromq_queue.ZeroMQQueue

Parent class for Plaso queues backed by ZeroMQ REQ sockets.

This class should not be instantiated directly; instantiate a subclass instead.

Instances of this class or subclasses may only be used to pop items, not to push.

PopItem()[source]

Pops an item off the queue.

If no ZeroMQ socket has been created, one will be created the first time this method is called.

Returns:item from the queue.
Return type:object
Raises:
  • KeyboardInterrupt – if the process is sent a KeyboardInterrupt while popping an item.
  • QueueEmpty – if the queue is empty, and no item could be popped within the queue timeout.
  • RuntimeError – if terminate event is missing.
  • zmq.error.ZMQError – if an error occurs in ZeroMQ.
PushItem(item, block=True)[source]

Pushes an item on to the queue.

Provided for compatibility with the API, but doesn’t actually work.

Parameters:
  • item (object) – item to push on the queue.
  • block (Optional[bool]) – whether the push should be performed in blocking or non-blocking mode.
Raises:WrongQueueType – as Push is not supported by this queue.

Module contents