Logging and Audit Trail

Flux employs logging to help programmers and system administrators determine what Flux is doing at any particular moment. Log messages assist in tracing your interactions with Flux, with debugging, with tracing the internal steps that Flux takes to do its work, and with recording the actions taken by each workflow as it runs.

The audit trail is a mechanism for recording all the data that is involved in performing actions with Flux. The audit trail is intended to recreate the time sequence in which jobs run. The audit trail provides a full data dump of each significant step that Flux takes.

In Flux, the difference between logging and the audit trail is that logging is particularly useful to programmers and administrators. Logging tracks your interactions with Flux and helps you debug, trace, and monitor your application.

The audit trail, on the other hand, records the time sequence of significant events in Flux from an end user’s perspective and a full accounting of the data associated with each event. The audit trail is useful for determining what high-level steps have occurred and can be used to demonstrate that certain activities did (or did not) take place.

By default, Flux logs are generated in the directory in which the Flux engine is running. To change the directory where logs are generated, set the INTERNAL_LOGGER_FILE_DIRECTORY engine configuration option. You can specify an absolute path to a directory, or a path relative to the directory in which the engine is started.

Logging

Four Different Loggers Defined in Flux

Flux defines four different loggers:

Client Logger: Logs client API calls made to the flux.Engine interface. The instantiation of an engine object is logged to the System Logger, described below.
Flow Chart Logger: Logs steps taken by each workflow as it executes, including before and after each action or trigger executes.
System Logger: Logs the instantiation of an engine object, plus the inner workings of Flux. The system logger also logs the internal steps that a Flux engine instance performs to coordinate the use of Flux and the firings of jobs. Among other things, these internal steps include refreshing database connections and failing over jobs.
Audit Trail Logger: Records the time sequence of significant events that occur in Flux, including a full accounting of data associated with each of these significant events. The Audit Trail Logger logs the entry and exit points of client API calls made to the flux.Engine interface. It also logs the entry and exit points of each trigger and action that is executed in a running workflow. The complete data involved in each of these significant events logged by the Audit Trail Logger is provided to your log4j or JDK loggers. You can use this data to perform a complete data dump to your logging destination, such as a file or a database.

The data is delivered as an object to the Audit Trail Logger. This object implements the flux.audittrail.AbstractAuditTrailEvent interface as well as sub-interfaces of that interface. All of the audit trail events are defined as interfaces in the flux.audittrail package. Some of these interfaces’ names are prefixed with the word Abstract, meaning that they do not represent a concrete audit trail event. However, sub-interfaces of these abstract interfaces do represent concrete audit trail events.

When you configure Flux, you can set the name of each of these four loggers. You can set the names in an engine configuration file:

CLIENT_LOGGER=fluxClient
FLOW_CHART_LOGGER=fluxWorkflow
SYSTEM_LOGGER=fluxSystem
AUDIT_TRAIL_LOGGER=fluxAuditTrail

Or in your Java code:

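// flux.Configuration is an interface, so obtain an instance from a factory
// (assumes the flux.Factory entry point) rather than calling "new".
Factory factory = Factory.makeInstance();
Configuration config = factory.makeConfiguration();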
config.setClientLogger("fluxClient");
config.setFlowChartLogger("fluxWorkflow");
config.setSystemLogger("fluxSystem");
config.setAuditTrailLogger("fluxAuditTrail");

The default names for each logger are listed in the flux.Configuration interface.

All of these loggers are always enabled. Configure your logging tool to decide which messages from which loggers you want to see.

Using Different Logging Tools with Flux

Flux supports log4j, the JDK logger, Apache Jakarta Commons Logging, the built-in internal synchronous and internal asynchronous loggers, a simple logger that prints messages to standard error, or no logger at all. By default, Flux logs error messages to standard error, but this setting can be changed in your Flux configuration.

To enable different logging tools, set the Flux configuration property called LOGGER_TYPE. This configuration property is represented by an enumeration in the flux.LoggerType interface. The enumeration values are listed below.

LOG4J

Uses the Log4j2 facility.

JDK

Uses the logging facility built into JDK.

COMMONS_LOGGING

Uses Apache Jakarta Commons Logging.

STANDARD_ERROR

Uses a simple, built-in logging facility. Logs stack traces and error messages to standard error. Use this option if you want to see errors but do not want to use one of the standard logging systems such as log4j or the JDK logger.

If you use this logger, the distinctions between the Client, Flow Chart, System, and Audit Trail loggers are lost. When using this logger, only stack traces and error messages are logged. More detailed logging information is not available unless you switch to the log4j or JDK logger.

INTERNAL_SYNCHRONOUS

Uses an internal logger to log information to the database and to log files stored on the file system. Logging operations are stored synchronously to the database using the same database connection that is used by the Flux operation being executed.

This technique records information only if it is committed by the database connection used by the Flux operation. If a transaction is rolled back, log information from that transaction is also rolled back. For example, if an action encounters an error while executing, the logs for that action will be rolled back and give the appearance that the action was never executed.

When this logger is used, only the Flux logs set at the SEVERE and WARNING levels are stored in the database. The complete Flux logs (down to the FINEST logging level) are stored asynchronously in rotating log files on the file system. The active log file name is flux.log.

To configure the logging level for this logger, use the Configuration.setInternalLoggerLevel() method in your engine’s configuration. This property does not affect the logging level for any other loggers. By default, the internal logging level is set to FINEST.

Six log file rotations are stored in files whose names follow the pattern flux-<engine><date>.log.<number>, where “<engine>” is the name of the engine that created the log, “<date>” is the date the log was generated, and “<number>” is the log’s position in the rotation. Each time a new log file is created in the rotation, existing log file numbers increase by one, such that flux-<engine><date>.log becomes flux-<engine><date>.log.1, flux-<engine><date>.log.1 becomes flux-<engine><date>.log.2, and so on. If all six log file rotations are already in use, flux-<engine><date>.log.5 becomes flux-<engine><date>.log.6, and the existing flux-<engine><date>.log.6 is deleted.
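The rotation scheme can be pictured as a sequence of file operations performed from the oldest file to the newest. The following is a minimal Java sketch of that sequence only. It is not Flux's implementation: the class name, the method name, and the base file name are hypothetical, and the code only lists the steps rather than touching the file system.

```java
import java.util.ArrayList;
import java.util.List;

public class RotationSketch {

    /**
     * Lists the steps for one rotation of the active log file "base",
     * keeping at most maxRotations numbered files. Steps are ordered
     * oldest file first so that no file is overwritten.
     */
    static List<String> rotationSteps(String base, int maxRotations) {
        List<String> steps = new ArrayList<>();
        // The oldest rotation is dropped to make room.
        steps.add("delete " + base + "." + maxRotations);
        // Every remaining numbered file moves up by one position.
        for (int i = maxRotations - 1; i >= 1; i--) {
            steps.add("rename " + base + "." + i + " -> " + base + "." + (i + 1));
        }
        // The active file becomes rotation number 1.
        steps.add("rename " + base + " -> " + base + ".1");
        return steps;
    }

    public static void main(String[] args) {
        // Placeholder base name; substitute your engine name and date.
        rotationSteps("flux-<engine><date>.log", 6).forEach(System.out::println);
    }
}
```

With six rotations the sketch lists seven steps: one deletion, five renames of numbered files, and the rename of the active file.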

To set the maximum size that the Flux log files can reach before being rotated, use the Configuration.setInternalLoggerFileRotationSize() method. Once the files reach the specified size, they will be rotated.

When this logger is used, all of the Flux audit trail entries are stored in the database in their entirety.

INTERNAL_ASYNCHRONOUS

The default logger type. Uses an internal logger to log information to the database and to log files stored on the file system. With this logger type, logging operations are stored asynchronously to the database over a different database connection than the one used by the Flux operation being executed. This technique increases throughput and records information even when the database connection used by the Flux operation is rolled back. As a result, information from a transaction is logged even if the transaction is rolled back. For example, if an action encounters an error while executing, the transaction will be rolled back, but the logs will record that the action tried to execute and failed to complete.

In the event of a system crash while using internal asynchronous logging, a small amount of logging information that was not written to a file or to the database could be lost.

When this logger is used, all of the Flux audit trail entries are stored in the database in their entirety.

NULL

Logs nothing. No logging output is generated, displayed, or sent anywhere.

After a name is assigned to each of the four Flux loggers, logger-specific configurations can be set up. You can configure each of the four loggers individually. Log4j and the JDK logger support configuration via configuration files or API calls. Using these configurations, you can enable or disable different loggers, set the logging levels, etc.

When using log4j, the JDK logger, or Commons Logging, the name you assign to each logger is used to link that logger to the logging mechanism of your choice. Once this linkage is established, you can configure the Flux loggers just like you would configure your own application loggers.
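As an illustration of this linkage with the JDK logger, the sketch below sets a level on each named logger through the java.util.logging API. It assumes the four logger names used in the earlier configuration example (fluxClient, fluxWorkflow, fluxSystem, fluxAuditTrail). The class name and the chosen levels are hypothetical and should be adapted to your needs.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.logging.Level;
import java.util.logging.Logger;

public class FluxJdkLoggerSetup {

    // The JDK's LogManager references loggers weakly, so keep strong
    // references here; otherwise a configured level may be lost if the
    // logger is garbage collected before it is used.
    private static final Map<String, Logger> LOGGERS = new LinkedHashMap<>();

    /** Looks up (or creates) the named JDK logger and sets its level. */
    static Logger configure(String name, Level level) {
        Logger logger = Logger.getLogger(name);
        logger.setLevel(level);
        LOGGERS.put(name, logger);
        return logger;
    }

    public static void main(String[] args) {
        // Logger names must match the names set in the Flux configuration.
        configure("fluxClient", Level.WARNING);
        configure("fluxWorkflow", Level.INFO);
        configure("fluxSystem", Level.WARNING);
        configure("fluxAuditTrail", Level.ALL);

        LOGGERS.forEach((name, logger) ->
                System.out.println(name + " -> " + logger.getLevel()));
    }
}
```

The same levels can be set in a JDK logging configuration file instead of through the API; the API form is used here only so that the sketch is self-contained.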

For an example of using and configuring the log4j logging tool with Flux, refer to the “Logging” example located in /examples/software_developers/logging under your Flux installation directory.

By linking Flux to a logger mechanism, you can have Flux and your application share the same logger. To avoid concurrency problems when Flux and your application want to use a logger at the same time, Flux always uses the synchronization policy of synchronizing on a logger object before calling methods on it. If you configure Flux to share loggers with your application, you should use this same synchronization policy. Note that the log4j documentation states that log4j logger objects are thread-safe, so if you use log4j, you may not need to synchronize on log4j logger objects when sharing them with Flux.

Using log4j to Monitor Engine Health

If you are using the log4j logger, you can easily tie Flux into log4j monitoring and notification capabilities to obtain notification in the event of an engine failure that prevents workflows or the engine from running normally.

In particular, many users take advantage of log4j’s SMTP capabilities to receive a mail notification when an error prevents Flux from running normally. Since the log4j mail notifications are generated outside of Flux, this allows email notifications to occur even when the engine itself is blocked or deadlocked.

For example, you can monitor the logs for the following error message, which appears if the engine is stuck and cannot continue processing:

The Flux engine’s primary flow of control is alive, but it is blocked or deadlocked and cannot make progress

Setting the Internal Logger Level

When you are using one of the internal logger tools (either synchronous or asynchronous), there are seven different logging levels available. These levels correspond to the logging levels in the JDK; however, there are standard mappings to Log4j. These standard mappings are described within each logging level.

The logging level is set using the INTERNAL_LOGGER_LEVEL engine configuration property. To set the logger level to FINEST, for example, you would set the following in your configuration:

INTERNAL_LOGGER_LEVEL=FINEST

The available logging levels are:

FINEST: Highly detailed tracing messages. Corresponds to the Log4j logging level org.apache.log4j.Priority.DEBUG. Also corresponds to the Commons Logging level “trace”.
FINER: Fairly detailed tracing messages. Corresponds to the Log4j logging level org.apache.log4j.Priority.DEBUG. Also corresponds to the Commons Logging level “trace”.
FINE: Basic tracing messages. Corresponds to the Log4j logging level org.apache.log4j.Priority.DEBUG. Also corresponds to the Commons Logging level “debug”.
CONFIG: Static configuration messages. Corresponds to the Log4j logging level org.apache.log4j.Priority.DEBUG. Also corresponds to the Commons Logging level “debug”.
INFO: Informational messages. Corresponds to the Log4j logging level org.apache.log4j.Priority.INFO. Also corresponds to the Commons Logging level “info”.
WARNING: Potential problems. Corresponds to the Log4j logging level org.apache.log4j.Priority.WARN. Also corresponds to the Commons Logging level “warn”.
SEVERE: Serious failures. Corresponds to the Log4j logging level org.apache.log4j.Priority.ERROR. Also corresponds to the Commons Logging level “error”.

Every logging level also logs the messages from all more severe levels. For example, when using the WARNING level, any messages from the SEVERE level are also logged, but when using the SEVERE level, only messages from that level are logged.
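The threshold behavior can be checked against the JDK's own level ordering. The sketch below is an illustration only: it assumes that the internal logger orders its levels the same way java.util.logging.Level does, and the class and method names are hypothetical.

```java
import java.util.logging.Level;

public class LevelThresholdSketch {

    // The seven levels from least to most severe.
    static final Level[] LEVELS = {
        Level.FINEST, Level.FINER, Level.FINE, Level.CONFIG,
        Level.INFO, Level.WARNING, Level.SEVERE
    };

    /** True if a message at "message" level passes the "configured" threshold. */
    static boolean isLogged(Level configured, Level message) {
        // JDK levels carry an integer value that grows with severity.
        return message.intValue() >= configured.intValue();
    }

    public static void main(String[] args) {
        for (Level message : LEVELS) {
            System.out.println(message + " logged at WARNING threshold: "
                    + isLogged(Level.WARNING, message));
        }
    }
}
```

Under this ordering, only WARNING and SEVERE messages pass a WARNING threshold, while a FINEST threshold passes all seven levels.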

Multiple Loggers

Flux can use several logging tools at the same time. For example, you may want to view logs and audit trail entries in the Operations Console, but also send events to your log4j audit trail listeners. To accomplish this, specify both the flux.logging.LoggerType.INTERNAL_ASYNCHRONOUS and flux.logging.LoggerType.LOG4J loggers in your engine configuration.

NOTE: It is illegal to specify both LoggerType.INTERNAL_ASYNCHRONOUS and LoggerType.INTERNAL_SYNCHRONOUS in the same configuration. It is also invalid to combine LoggerType.NULL with other logger types.

To specify multiple loggers in your engine configuration file, you set LOGGER_TYPES.<number>=<type>, where “<number>” is the logger’s position in the list and “<type>” is the logging tool that should be used. To use the Internal Asynchronous and Log4j loggers, for example, you would set the following in the engine configuration:

LOGGER_TYPES.0=INTERNAL_ASYNCHRONOUS
LOGGER_TYPES.1=LOG4J

Audit Trail

Filtering the Audit Trail

By default, Flux will audit all events. If there is a significant amount of activity on the engine, this can lower performance, cause unnecessary data to be stored in the database, and impact your ability to find the events that you are actually interested in.

The solution is to apply an audit trail filter that only allows the events that you require.

You can set an audit trail filter in the engine configuration. The audit trail filter is applied by setting AUDIT_TRAIL_FILTER.<number>=<event>, where “<number>” is the event’s position in the list and “<event>” is the fully qualified class name of the event.

For a complete list of fully qualified audit trail event class names (and a description of each), see the tables just beneath the example below.

As an example, if you only wanted to audit the events for entering, executing, and exiting actions and triggers, you would supply the following filter:

AUDIT_TRAIL_FILTER.0=flux.audittrail.server.EnteringActionEvent
AUDIT_TRAIL_FILTER.1=flux.audittrail.server.EnteringTriggerEvent
AUDIT_TRAIL_FILTER.2=flux.audittrail.server.ExecutingActionEvent
AUDIT_TRAIL_FILTER.3=flux.audittrail.server.ExitingActionEvent
AUDIT_TRAIL_FILTER.4=flux.audittrail.server.ExitingTriggerEvent

With the filter above, no audit trail events will be recorded except the items specifically listed.
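When a filter contains many events, numbering the entries by hand is error-prone. The sketch below generates consecutively numbered filter lines from a list of event class names. It is a convenience helper only, with hypothetical class and method names; it does not call any Flux API, and its output is meant to be pasted into an engine configuration file.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class AuditTrailFilterLines {

    /** Builds one AUDIT_TRAIL_FILTER.<number>=<event> line per class name, numbered from 0. */
    static List<String> filterLines(List<String> eventClassNames) {
        List<String> lines = new ArrayList<>();
        for (int i = 0; i < eventClassNames.size(); i++) {
            lines.add("AUDIT_TRAIL_FILTER." + i + "=" + eventClassNames.get(i));
        }
        return lines;
    }

    public static void main(String[] args) {
        List<String> events = Arrays.asList(
                "flux.audittrail.server.EnteringActionEvent",
                "flux.audittrail.server.ExitingActionEvent",
                "flux.audittrail.server.ExitingActionOnErrorEvent");
        filterLines(events).forEach(System.out::println);
    }
}
```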

List of Audit Trail Events

The following tables contain all of the client, server, and engine status audit trail events. You can use these tables when determining which audit trail events to include in your filter.

Client Events

These events are registered when a Flux client interacts with the engine, or when the engine interacts with the database. A client could be a piece of software using the Flux Java APIs or a user accessing the Operations Console.

Typically, these events (with the possible exception of the FlowChartRemovedEvent) would not be included in the audit trail filter. Unless a complete record of API usage is required, these events are not normally helpful in tracing workflow execution or debugging problems and can be safely excluded from the audit trail.

Fully Qualified Class Name	Event Name in Operations Console Filter	Description
flux.audittrail.client.EnteringCallEvent	Entering Call	An API call or client connection to the engine started (this may be logged, for example, when a user connects to the Operations Console).
flux.audittrail.client.ExitingCallEvent	Exiting Call	An API call or client connection to the engine ended normally.
flux.audittrail.client.ExitingCallWithErrorEvent	Exiting Call with Error	An API call ended with an error (that is, the API call threw an Exception).
flux.audittrail.client.FlowChartRemovedEvent	Workflow Removed	A workflow was removed from the engine (this could happen if a user manually removes the workflow, or if the workflow finishes normally).
flux.audittrail.client.J2seTransactionCommittedEvent	J2se Transaction Committed	A database transaction involved in a J2SE client call was committed (that is, when the engine configuration is using direct JDBC rather than a data source). Due to the nature of J2EE transaction servers, there is no equivalent event for J2EE transaction commits.
flux.audittrail.client.J2seTransactionRolledBackEvent	J2se Transaction Rolled Back	A database transaction involved in a J2SE client call was rolled back. As with the transaction committed event, there is no equivalent event for J2EE transactions.

Server Events

These events are recorded by the Flux engine when it performs actions (like executing a trigger or action within a workflow). These events are especially useful when viewing the execution history of a workflow, for either debugging or historical data purposes.

Fully Qualified Class Name	Event Name in Operations Console Filter	Description
flux.audittrail.server.ActionTimeoutEvent	Action Timeout	A trigger or action has reached its timeout before it fired or completed execution (in other words, the trigger or action timed out).
flux.audittrail.server.DeadlineApproachingEvent	Deadline Approaching	A workflow or workflow run has entered its deadline window and is now approaching the deadline.
flux.audittrail.server.DeadlineExceededEvent	Deadline Exceeded	A workflow or workflow run has passed its deadline.
flux.audittrail.server.DeferringExecutionFlowEvent	Deferring Flow Execution	An execution flow is stopping for some time or waiting for a condition before continuing. This might occur for a few reasons: 1. A trigger has been reached, but it is not yet ready to fire. 2. The workflow containing the execution flow has been paused. 3. The execution flow has reached a Process Action that is configured to run on an agent, but there is not yet an agent available to run the process.
flux.audittrail.server.EndingRunEvent	Ending Run	A workflow run has finished (that is, the workflow has reached the action with the “End of Run” property enabled).
flux.audittrail.server.EnteringActionEvent	Entering Action	A workflow has entered an action. This event only means that a workflow has reached the action – it does not, however, mean that the action has actually executed (which is indicated by the ExecutingActionEvent).
flux.audittrail.server.EnteringTriggerEvent	Entering Trigger	A workflow has entered a trigger. This event indicates that a workflow has reached the trigger and the trigger is ready to begin polling — it does not, however, mean that the trigger has actually fired (which is indicated by the ExecutingActionEvent).
flux.audittrail.server.ExecutingActionEvent	Executing Action	A workflow has begun executing a trigger or action, and the trigger has provided an update to the engine on its execution status. Note that not all triggers and actions provide such updates.
flux.audittrail.server.ExitingActionEvent	Exiting Action	An action has finished executing normally (without encountering an error that stopped execution). Execution continues through the workflow.
flux.audittrail.server.ExitingActionOnErrorEvent	Exiting Action on Error	An action has encountered an error during its execution that has caused execution to stop and the action to exit.
flux.audittrail.server.ExitingActionOnSignalEvent	Exiting Action on Signal	An action is exiting because a signal is raised that indicates the action should stop.
flux.audittrail.server.ExitingActionOnTimeoutEvent	Exiting Action on Timeout	An action is exiting after having timed out.
flux.audittrail.server.ExitingTriggerEvent	Exiting Trigger	A trigger has fired successfully (without encountering an error that caused the trigger to fail). Execution continues through the workflow.
flux.audittrail.server.ExitingTriggerOnErrorEvent	Exiting Trigger on Error	A trigger has encountered an error and failed to fire.
flux.audittrail.server.ExitingTriggerOnSignalEvent	Exiting Trigger on Signal	A trigger is exiting because a signal is raised that indicates the trigger should stop polling.
flux.audittrail.server.ExitingTriggerOnTimeoutEvent	Exiting Trigger on Timeout	A trigger has timed out and is exiting without firing.
flux.audittrail.server.ExitingTriggerWithoutFiringEvent	Exiting Trigger without Firing	A trigger has polled to determine whether it is ready to fire, but its condition was not successful (the trigger was not ready to fire). The trigger is exiting and waiting until its next polling time to check the condition again. This event is recorded each time the trigger polls until it fires successfully.
flux.audittrail.server.FinishingExecutionFlowEvent	Finishing Execution Flow	An execution flow has finished. This does not necessarily mean that the workflow as a whole has finished; if the workflow contains a split, this event is recorded for each execution flow branch in the workflow. This event is also recorded any time a workflow reaches a join point.
flux.audittrail.server.FinishingFlowChartEvent	Finishing Workflow	The workflow has finished and will be removed from the engine. A workflow is considered finished when all its execution flows have finished.
flux.audittrail.server.FlowChartFailoverEvent	Workflow Failover	A workflow has been failed over (the workflow’s heartbeat in the database was not updated within the engine’s failover time window) and the workflow has been “released” from its engine. Another engine in the cluster will assume execution of the workflow.
flux.audittrail.server.StartingExecutionFlowEvent	Starting Execution Flow	An execution flow has started. This could mean that the workflow is just beginning and is executing its start action, or it could mean that the workflow has encountered a split and one or more new execution flows have been created.
flux.audittrail.server.StartingRunEvent	Starting Run	A workflow run has started (that is, the trigger or action with the “Start of Run” property enabled has been reached). Note that this event is logged at the same time as the EnteringActionEvent / EnteringTriggerEvent for the “Start of Run” trigger or action, not the ExecutingActionEvent.

Status Events

These events are generated by the engine automatically to provide updates on its internal status and health.

Fully Qualified Class Name	Event Name in Operations Console Filter	Description
flux.audittrail.status.EngineContentsSummary	not available in Operations Console	A complete summary about the state of each workflow in the Flux engine (generated according to the frequency specified by the engine configuration option ENGINE_CONTENTS_SUMMARY_FREQUENCY). This event is also recorded to the Flux logs at the INFO logging level.
flux.audittrail.status.EngineDatabaseCommandsFailing	Engine Database Commands Failing	If this event occurs, at least five closely spaced database commands issued from the Flux engine to the database have failed with an error. This usually means that the database server is down, the database cannot be contacted (possibly due to network outages or interference), or that the Flux schema contains an error or is misconfigured. This event is delivered at a regular interval until the database error condition is no longer detected.
flux.audittrail.status.EngineMainBlocked	Engine Main Blocked	The Flux engine is alive, but is blocked or deadlocked and cannot make progress. This indicates that the engine has become stuck and is not able to execute or assign any workflows.
flux.audittrail.status.EngineMainStopped	Engine Main Stopped	The Flux engine has stopped due to an error condition. This event is not generated if the engine is stopped normally.
flux.audittrail.status.FlowContextDetails	not available in Operations Console	Contains status information about a flow context (execution flow) running within a workflow.
flux.audittrail.status.Ok	not available in Operations Console	This indicates that the engine is executing normally and no error condition has been detected. This heartbeat is generated at a frequency specified by the HEARTBEAT_FREQUENCY engine configuration option.

Recommended Audit Trail Filter

The audit trail filter settings below are recommended for typical use cases of Flux (although, as always, your particular requirements might vary, so you may find that you will need to modify this filter to better suit your specific environment).

AUDIT_TRAIL_FILTER.1=flux.audittrail.server.DeadlineApproachingEvent
AUDIT_TRAIL_FILTER.2=flux.audittrail.server.DeadlineExceededEvent
AUDIT_TRAIL_FILTER.3=flux.audittrail.server.DeferringExecutionFlowEvent
AUDIT_TRAIL_FILTER.4=flux.audittrail.server.EndingRunEvent
AUDIT_TRAIL_FILTER.5=flux.audittrail.status.EngineDatabaseCommandsFailing
AUDIT_TRAIL_FILTER.6=flux.audittrail.status.EngineMainBlocked
AUDIT_TRAIL_FILTER.7=flux.audittrail.status.EngineMainStopped
AUDIT_TRAIL_FILTER.8=flux.audittrail.server.EnteringActionEvent
AUDIT_TRAIL_FILTER.9=flux.audittrail.server.EnteringTriggerEvent
AUDIT_TRAIL_FILTER.10=flux.audittrail.status.ErrorCondition
AUDIT_TRAIL_FILTER.11=flux.audittrail.server.ExecutingActionEvent
AUDIT_TRAIL_FILTER.12=flux.audittrail.server.ExitingActionEvent
AUDIT_TRAIL_FILTER.13=flux.audittrail.server.ExitingTriggerEvent
AUDIT_TRAIL_FILTER.14=flux.audittrail.server.ExitingActionOnErrorEvent
AUDIT_TRAIL_FILTER.15=flux.audittrail.server.ExitingActionOnSignalEvent
AUDIT_TRAIL_FILTER.16=flux.audittrail.server.ExitingActionOnTimeoutEvent
AUDIT_TRAIL_FILTER.17=flux.audittrail.server.ExitingTriggerOnErrorEvent
AUDIT_TRAIL_FILTER.18=flux.audittrail.server.ExitingTriggerOnSignalEvent
AUDIT_TRAIL_FILTER.19=flux.audittrail.server.ExitingTriggerOnTimeoutEvent
AUDIT_TRAIL_FILTER.20=flux.audittrail.server.ExitingTriggerWithoutFiringEvent
AUDIT_TRAIL_FILTER.21=flux.audittrail.server.FinishingExecutionFlowEvent
AUDIT_TRAIL_FILTER.22=flux.audittrail.server.FinishingFlowChartEvent
AUDIT_TRAIL_FILTER.23=flux.audittrail.server.FlowChartFailoverEvent
AUDIT_TRAIL_FILTER.24=flux.audittrail.client.FlowChartRemovedEvent
AUDIT_TRAIL_FILTER.25=flux.audittrail.server.StartingExecutionFlowEvent
AUDIT_TRAIL_FILTER.26=flux.audittrail.server.StartingRunEvent

Disabling the Audit Trail

You can disable the audit trail completely by setting an empty audit trail filter in your engine configuration, like so:

AUDIT_TRAIL_FILTER.0=

With an empty filter, no audit trail events will be recorded.

Filtering Custom Events

To add a custom event to the audit trail filter, add the name of your custom event in place of the class name that would normally be used. For example:

AUDIT_TRAIL_FILTER.0=MyCustomEventName

Creating Audit Trail Listeners

The audit trail serves an additional purpose. You can use the audit trail to create various kinds of job listeners. For example, you can notify your application when different audit trail events occur, such as when the engine is started or when a job fires.

To be sure, Flux’s workflow model already serves this purpose well. The Flux workflow model is vastly superior to using a model of listeners. It is easier to coordinate a job’s activities using a workflow and its workflow model than chaining together listeners manually.

That being said, listeners can still be useful. A listener could deliver a message when a Flux engine is disposed. The delivery of that message can trigger other activities within your application.

A listener could also deliver a message to a monitoring user interface when a job fires or finishes. This user interface could display recent job firing activity.

To install an audit trail listener, you must use the JDK logger.

JDK: Configure a Formatter using the JDK’s configuration files or APIs.
- java.util.logging.Formatter: Sub-class this class and implement its format() method. In the body of this method, call LogRecord.getParameters() on the format() argument. Extract the first element from the returned object array. Cast that first element to a flux.audittrail.AbstractAuditTrailEvent object. Then process and re-deliver the audit trail event to its destination in your application.
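The steps above can be sketched as follows. To stay runnable without the Flux libraries on the classpath, this sketch hands the first log record parameter to a listener as a plain Object; in a Flux application you would cast it to flux.audittrail.AbstractAuditTrailEvent at the marked point. The class name AuditTrailListenerFormatter and the Consumer-based listener are hypothetical choices made for illustration.

```java
import java.util.function.Consumer;
import java.util.logging.Formatter;
import java.util.logging.Level;
import java.util.logging.LogRecord;

public class AuditTrailListenerFormatter extends Formatter {

    private final Consumer<Object> listener;

    public AuditTrailListenerFormatter(Consumer<Object> listener) {
        this.listener = listener;
    }

    @Override
    public String format(LogRecord record) {
        Object[] params = record.getParameters();
        if (params != null && params.length > 0 && params[0] != null) {
            // In a Flux application, cast params[0] to
            // flux.audittrail.AbstractAuditTrailEvent here before
            // re-delivering it to its destination.
            listener.accept(params[0]);
        }
        // Still return a formatted line so the handler has something to write.
        return formatMessage(record) + System.lineSeparator();
    }

    public static void main(String[] args) {
        AuditTrailListenerFormatter formatter = new AuditTrailListenerFormatter(
                event -> System.out.println("received: " + event));

        // Stand-in record; Flux would supply the audit trail event object.
        LogRecord record = new LogRecord(Level.INFO, "audit trail event");
        record.setParameters(new Object[] {"stand-in event"});
        System.out.print(formatter.format(record));
    }
}
```

Because this formatter takes a constructor argument, it would be installed through the API, for example by calling setFormatter() on a java.util.logging.Handler attached to the audit trail logger. A formatter named in a JDK logging configuration file needs a public no-argument constructor instead.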

Heartbeat

The heartbeat event contains general diagnostic information about the health of the Flux engine instance that generated it. This event is generated according to the frequency specified in the HEARTBEAT_FREQUENCY engine configuration property. However, some heartbeat events are generated as soon as they occur. In particular, the EngineDatabaseCommandsFailing heartbeat event is generated and delivered as soon as it occurs.

In general, there are four types of heartbeats: Ok, EngineMainBlocked, EngineMainStopped, and EngineDatabaseCommandsFailing. The Ok, EngineMainBlocked, and EngineMainStopped heartbeats all give diagnostics for the engine at the heartbeat frequency, which defaults to one hour. The EngineDatabaseCommandsFailing heartbeat, however, returns the problem immediately following the occurrence of the error.

The Ok heartbeat indicates that no error has been diagnosed in the Flux engine. This heartbeat event is sent to the audit trail as well as the logs at the info logging level. An example of the printout for this heartbeat is shown below.

The EngineMainStopped heartbeat indicates that the Flux engine’s primary flow of control has stopped due to an error condition. This event is not generated if the Flux engine is stopped during the normal course of operation.

The EngineMainBlocked heartbeat indicates that the Flux engine’s primary flow of control is alive, but it is blocked or deadlocked and cannot make progress.

The EngineDatabaseCommandsFailing heartbeat indicates that at least five closely spaced database commands issued by the Flux engine have all failed with an error. Typically, this error condition indicates that the database server is down, the database server cannot be contacted, or the Flux database schema contains errors or is misconfigured. This event is delivered as soon as this error condition is detected. Furthermore, until it is identified that this error condition no longer exists, this event is delivered at the regular heartbeat reporting interval.

Below is code for creating a Flux engine. This code also configures a logger to record the engine’s events. A logger is needed to receive the heartbeat information for the Flux engine.

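// Obtain a factory (assumes the flux.Factory entry point) and create a configuration.
Factory factory = Factory.makeInstance();
Configuration configuration = factory.makeConfiguration();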

// Set heartbeat frequency and logger type.
configuration.setHeartbeatFrequency("+10s");
configuration.setName("Flux");
configuration.setLoggerType(LoggerType.LOG4J);

// Create the Flux engine.
Engine engine = factory.makeEngine(configuration);

Once this code is executed, the logger will begin recording events from the engine. Notice that setHeartbeatFrequency() is called with “+10s”, meaning that the heartbeat of the engine will be printed out every ten seconds. An example printout of an Ok heartbeat is shown below.

Jun-14 15:48:25 system - #### START HEARTBEAT MESSAGE ####
Name: flux.audittrail.status.Ok
Engine Name: Flux
Free Memory: 1.00 MB
Total Memory: 4.14 MB
Maximum Connections: 5
Occupied Connections: 0
Running Flow Charts: 0
#### END HEARTBEAT MESSAGE ####
Jun-14 15:48:25 audit_trail - fluximpl.core.heartbeat.OkImpl@676e3f
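If you want to feed heartbeat printouts like the one above into a monitoring tool, the fields between the START and END markers can be read as simple name and value pairs. The following sketch assumes the printout layout shown in this section; the class and method names are hypothetical, and the exact layout may differ depending on your logger configuration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class HeartbeatParser {

    private static final String START = "#### START HEARTBEAT MESSAGE ####";
    private static final String END = "#### END HEARTBEAT MESSAGE ####";

    /** Extracts "Name: value" fields found between the START and END markers. */
    static Map<String, String> parse(String text) {
        Map<String, String> fields = new LinkedHashMap<>();
        boolean inside = false;
        for (String line : text.split("\\R")) {
            String trimmed = line.trim();
            if (trimmed.endsWith(START)) {
                inside = true;          // the marker may follow a timestamp prefix
                continue;
            }
            if (trimmed.endsWith(END)) {
                break;                  // stop at the end of the first message
            }
            int colon = trimmed.indexOf(": ");
            if (inside && colon > 0) {
                fields.put(trimmed.substring(0, colon), trimmed.substring(colon + 2));
            }
        }
        return fields;
    }

    public static void main(String[] args) {
        String sample = String.join("\n",
                "Jun-14 15:48:25 system - " + START,
                "Name: flux.audittrail.status.Ok",
                "Engine Name: Flux",
                "Free Memory: 1.00 MB",
                "Total Memory: 4.14 MB",
                "Maximum Connections: 5",
                "Occupied Connections: 0",
                "Running Flow Charts: 0",
                END);
        parse(sample).forEach((name, value) -> System.out.println(name + " = " + value));
    }
}
```

A monitoring script could then, for example, raise an alert whenever the Name field is anything other than flux.audittrail.status.Ok.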

Using the Audit Trail for Workflow Dependencies

Workflows often need to coordinate with each other. As a simple example, one workflow may need to wait for another workflow to finish before it can continue. As another example, one workflow may process files one at a time. As each file is processed, another workflow may need to be notified so it can continue processing the file.

These kinds of dependencies can be coordinated using the audit trail.

To set up this kind of dependency, one workflow needs to “watch” for other workflows to finish or to reach milestones. “Workflow watching” is performed using the Audit Trail Trigger. By setting audit trail search criteria on the Audit Trail Trigger, a workflow can monitor certain workflows to determine when they have finished or reached certain milestones.

To define a milestone, a running workflow needs to send an audit trail event to the audit trail at the appropriate time. An audit trail event has a name and a message. As a running workflow reaches its milestones, it can send its audit trail event to the audit trail using the FlowContext.sendToAuditTrail() method.

Using the audit trail in this manner provides a simple mechanism for workflow dependencies without having to resort to Flux’s messaging paradigm.

Logging and the Operations Console

The Operations Console contains a special page, called the logs page, that displays a history of SEVERE or WARNING level log messages from the engine (the amount of history available from this page depends on the log expiration set on the engine – see below for more information). The logs page is available under the “Reports” tab in the console.

In order to view the logging page of the Operations Console, you must be using the internal Flux logging facility. This means that either the INTERNAL_SYNCHRONOUS or INTERNAL_ASYNCHRONOUS logger must be enabled to use the logging page for a particular engine.

Log, Audit Trail, and Run History Expiration

When Flux logging and audit trail features are used, the volume of log and audit trail entries can grow very large (especially the audit trail). To avoid this, it is useful to set the LOG_EXPIRATION and AUDIT_TRAIL_EXPIRATION options in your engine configuration. These options specify time expressions that indicate how long individual log and audit trail entries are saved before they are automatically pruned. A “null” log expiration indicates that log entries are never deleted from the database. If you are using the default H2 in-memory database, both of these properties default to “+1H”, or one hour. Otherwise, the default value for each is “+w”, or one week.

The AUDIT_TRAIL_EXPIRATION option also determines when run history and run average entries expire. Keep this in mind when determining the best value for the audit trail expiration in your environment.

To set the log and audit trail expiration (and with it the run average and run history expiration) to 24 hours each, you could set the following in your configuration:

LOG_EXPIRATION=+24H
AUDIT_TRAIL_EXPIRATION=+24H