plaso.multi_processing package¶
Submodules¶
plaso.multi_processing.analysis_process module¶
The multi-process analysis process.
-
class
plaso.multi_processing.analysis_process.
AnalysisProcess
(event_queue, storage_writer, knowledge_base, analysis_plugin, processing_configuration, data_location=None, event_filter_expression=None, **kwargs)[source]¶ Bases:
plaso.multi_processing.base_process.MultiProcessBaseProcess
Multi-processing analysis process.
plaso.multi_processing.base_process module¶
Base class for a process used in multi-processing.
-
class
plaso.multi_processing.base_process.
MultiProcessBaseProcess
(processing_configuration, enable_sigsegv_handler=False, **kwargs)[source]¶ Bases:
multiprocessing.context.Process
Multi-processing process interface.
-
rpc_port
¶ int – port number of the process status RPC server.
-
name
¶ str – process name.
-
plaso.multi_processing.engine module¶
The multi-process processing engine.
-
class
plaso.multi_processing.engine.
MultiProcessEngine
[source]¶ Bases:
plaso.engine.engine.BaseEngine
Multi-process engine base.
This class contains functionality to: * monitor and manage worker processes; * retrieve a process status information via RPC; * manage the status update thread.
plaso.multi_processing.logger module¶
The multi-processing sub module logger.
plaso.multi_processing.multi_process_queue module¶
A multiprocessing-backed queue.
-
class
plaso.multi_processing.multi_process_queue.
MultiProcessingQueue
(maximum_number_of_queued_items=0, timeout=None)[source]¶ Bases:
plaso.engine.plaso_queue.Queue
Multi-processing queue.
-
Close
(abort=False)[source]¶ Closes the queue.
This needs to be called from any process or thread putting items onto the queue.
Parameters: abort (Optional[bool]) – True if the close was issued on abort.
-
plaso.multi_processing.plaso_xmlrpc module¶
XML RPC server and client.
-
class
plaso.multi_processing.plaso_xmlrpc.
ThreadedXMLRPCServer
(callback)[source]¶ Bases:
plaso.multi_processing.rpc.RPCServer
Threaded XML RPC server.
-
class
plaso.multi_processing.plaso_xmlrpc.
XMLProcessStatusRPCClient
[source]¶ Bases:
plaso.multi_processing.plaso_xmlrpc.XMLRPCClient
XML process status RPC client.
-
class
plaso.multi_processing.plaso_xmlrpc.
XMLProcessStatusRPCServer
(callback)[source]¶ Bases:
plaso.multi_processing.plaso_xmlrpc.ThreadedXMLRPCServer
XML process status threaded RPC server.
-
class
plaso.multi_processing.plaso_xmlrpc.
XMLRPCClient
[source]¶ Bases:
plaso.multi_processing.rpc.RPCClient
XML RPC client.
plaso.multi_processing.psort module¶
The psort multi-processing engine.
-
class
plaso.multi_processing.psort.
PsortEventHeap
[source]¶ Bases:
object
Psort event heap.
-
PopEvent
()[source]¶ Pops an event from the heap.
Returns: containing: - str: identifier of the event MACB group or None if the event cannot
- be grouped.
str: identifier of the event content. EventObject: event.
Return type: tuple
-
PushEvent
(event)[source]¶ Pushes an event onto the heap.
Parameters: event (EventObject) – event.
-
number_of_events
¶ int – number of events on the heap.
-
-
class
plaso.multi_processing.psort.
PsortMultiProcessEngine
(use_zeromq=True)[source]¶ Bases:
plaso.multi_processing.engine.MultiProcessEngine
Psort multi-processing engine.
-
AnalyzeEvents
(knowledge_base_object, storage_writer, data_location, analysis_plugins, processing_configuration, event_filter=None, event_filter_expression=None, status_update_callback=None, worker_memory_limit=None)[source]¶ Analyzes events in a plaso storage.
Parameters: - knowledge_base_object (KnowledgeBase) – contains information from the source data needed for processing.
- storage_writer (StorageWriter) – storage writer.
- data_location (str) – path to the location that data files should be loaded from.
- analysis_plugins (dict[str, AnalysisPlugin]) – analysis plugins that should be run and their names.
- processing_configuration (ProcessingConfiguration) – processing configuration.
- event_filter (Optional[FilterObject]) – event filter.
- event_filter_expression (Optional[str]) – event filter expression.
- status_update_callback (Optional[function]) – callback function for status updates.
- worker_memory_limit (Optional[int]) – maximum amount of memory a worker is allowed to consume, where None represents the default memory limit and 0 represents no limit.
Raises: KeyboardInterrupt
– if a keyboard interrupt was raised.
-
ExportEvents
(knowledge_base_object, storage_reader, output_module, processing_configuration, deduplicate_events=True, event_filter=None, status_update_callback=None, time_slice=None, use_time_slicer=False)[source]¶ Exports events using an output module.
Parameters: - knowledge_base_object (KnowledgeBase) – contains information from the source data needed for processing.
- storage_reader (StorageReader) – storage reader.
- output_module (OutputModule) – output module.
- processing_configuration (ProcessingConfiguration) – processing configuration.
- deduplicate_events (Optional[bool]) – True if events should be deduplicated.
- event_filter (Optional[FilterObject]) – event filter.
- status_update_callback (Optional[function]) – callback function for status updates.
- time_slice (Optional[TimeSlice]) – slice of time to output.
- use_time_slicer (Optional[bool]) – True if the ‘time slicer’ should be used. The ‘time slicer’ will provide a context of events around an event of interest.
Returns: - counter that tracks the number of events extracted
from storage.
Return type: collections.Counter
-
plaso.multi_processing.rpc module¶
The RPC client and server interface.
-
class
plaso.multi_processing.rpc.
RPCServer
(callback)[source]¶ Bases:
object
RPC server interface.
plaso.multi_processing.task_engine module¶
The task multi-process processing engine.
-
class
plaso.multi_processing.task_engine.
TaskMultiProcessEngine
(maximum_number_of_tasks=10000, use_zeromq=True)[source]¶ Bases:
plaso.multi_processing.engine.MultiProcessEngine
Class that defines the task multi-process engine.
This class contains functionality to: * monitor and manage extraction tasks; * merge results returned by extraction workers.
-
ProcessSources
(session_identifier, source_path_specs, storage_writer, processing_configuration, enable_sigsegv_handler=False, filter_find_specs=None, number_of_worker_processes=0, status_update_callback=None, worker_memory_limit=None)[source]¶ Processes the sources and extract events.
Parameters: - session_identifier (str) – identifier of the session.
- source_path_specs (list[dfvfs.PathSpec]) – path specifications of the sources to process.
- storage_writer (StorageWriter) – storage writer for a session storage.
- processing_configuration (ProcessingConfiguration) – processing configuration.
- enable_sigsegv_handler (Optional[bool]) – True if the SIGSEGV handler should be enabled.
- filter_find_specs (Optional[list[dfvfs.FindSpec]]) – find specifications used in path specification extraction.
- number_of_worker_processes (Optional[int]) – number of worker processes.
- status_update_callback (Optional[function]) – callback function for status updates.
- worker_memory_limit (Optional[int]) – maximum amount of memory a worker is allowed to consume, where None represents the default memory limit and 0 represents no limit.
Returns: processing status.
Return type:
-
plaso.multi_processing.task_manager module¶
The task manager.
-
class
plaso.multi_processing.task_manager.
TaskManager
[source]¶ Bases:
object
Manages tasks and tracks their completion and status.
A task being tracked by the manager must be in exactly one of the following states:
- abandoned: a task assumed to be abandoned because a tasks that has been
- queued or was processing exceeds the maximum inactive time.
- merging: a task that is being merged by the engine.
- pending_merge: the task has been processed and is ready to be merged with
- the session storage.
- processed: a worker has completed processing the task, but it is not ready
- to be merged into the session storage.
- processing: a worker is processing the task.
- queued: the task is waiting for a worker to start processing it. It is also
- possible that a worker has already completed the task, but no status update was collected from the worker while it processed the task.
Once the engine reports that a task is completely merged, it is removed from the task manager.
Tasks are considered “pending” when there is more work that needs to be done to complete these tasks. Pending applies to tasks that are: * not abandoned; * abandoned, but need to be retried.
Abandoned tasks without corresponding retry tasks are considered “failed” when the foreman is done processing.
-
CheckTaskToMerge
(task)[source]¶ Checks if the task should be merged.
Parameters: task (Task) – task. Returns: True if the task should be merged. Return type: bool Raises: KeyError
– if the task was not queued, processing or abandoned.
-
CompleteTask
(task)[source]¶ Completes a task.
The task is complete and can be removed from the task manager.
Parameters: task (Task) – task. Raises: KeyError
– if the task was not merging.
-
CreateRetryTask
()[source]¶ Creates a task that to retry a previously abandoned task.
Returns: - a task that was abandoned but should be retried or None if there are
- no abandoned tasks that should be retried.
Return type: Task
-
CreateTask
(session_identifier)[source]¶ Creates a task.
Parameters: session_identifier (str) – the identifier of the session the task is part of. Returns: task attribute container. Return type: Task
-
GetFailedTasks
()[source]¶ Retrieves all failed tasks.
Failed tasks are tasks that were abandoned and have no retry task once the foreman is done processing.
Returns: tasks. Return type: list[Task]
-
GetProcessedTaskByIdentifier
(task_identifier)[source]¶ Retrieves a task that has been processed.
Parameters: task_identifier (str) – unique identifier of the task. Returns: a task that has been processed. Return type: Task Raises: KeyError
– if the task was not processing, queued or abandoned.
-
GetStatusInformation
()[source]¶ Retrieves status information about the tasks.
Returns: tasks status information. Return type: TasksStatus
-
GetTaskPendingMerge
(current_task)[source]¶ Retrieves the first task that is pending merge or has a higher priority.
This function will check if there is a task with a higher merge priority than the current_task being merged. If so, that task with the higher priority is returned.
Parameters: current_task (Task) – current task being merged or None if no such task. Returns: - the next task to merge or None if there is no task pending merge or
- with a higher priority.
Return type: Task
-
HasPendingTasks
()[source]¶ Determines if there are tasks running or in need of retrying.
Returns: - True if there are tasks that are active, ready to be merged or
- need to be retried.
Return type: bool
-
RemoveTask
(task)[source]¶ Removes an abandoned task.
Parameters: task (Task) – task. Raises: KeyError
– if the task was not abandoned or the task was abandoned and was not retried.
-
SampleTaskStatus
(task, status)[source]¶ Takes a sample of the status of the task for profiling.
Parameters: - task (Task) – a task.
- status (str) – status.
-
StartProfiling
(configuration, identifier)[source]¶ Starts profiling.
Parameters: - configuration (ProfilingConfiguration) – profiling configuration.
- identifier (str) – identifier of the profiling session used to create the sample filename.
plaso.multi_processing.worker_process module¶
The multi-process worker process.
-
class
plaso.multi_processing.worker_process.
WorkerProcess
(task_queue, storage_writer, knowledge_base, session_identifier, processing_configuration, **kwargs)[source]¶ Bases:
plaso.multi_processing.base_process.MultiProcessBaseProcess
Class that defines a multi-processing worker process.