DOM/ECMAScript "Agents"

THIS PROPOSAL IS OBSOLETE. It has been split into a PR on the ECMAScript spec and bits that are specific to the shared memory proposal.

OBSOLETE DRAFT. Revised: 2016-03-24. lhansen@mozilla.com

Introduction

This is a companion spec to the Shared memory and Atomics specification ("SAB spec", for short). The purpose of the present specification is to formalize what the SAB spec means when it refers to "agent", to give a solid footing to the runtime semantic operations that the SAB spec depends on, and to hook "agent" into the ES6 spec at appropriate points (not very many).

This companion spec has two parts, one for the ECMAScript spec and one for the DOM/HTML spec. The latter is a secondary concern to this spec's author.

Terminology: I'm going to use "agent" for the time being since that is what the SAB spec uses; C++ uses it too, for its similar concept. "Vat" and "continent" would be OK too; I suggest we bikeshed a name later.

Obvious discussion items:

Relevant links:

Changelog:

1 Executable Code and Execution Contexts (ES6 8)

1.1 Execution contexts (ES6 8.3)

Paragraph one: Change the last two sentences to the following (changes bolded):

At any point in time, there is at most one execution context per agent that is actually executing code. This is known as the agent's running execution context. All references to the running execution context in this specification reference the running execution context of the surrounding agent.

1.2 Jobs and Job Queues (ES6 8.4)

In the paragraph between Table 25 and Table 26, append the following sentence:

Each agent has its own set of named job queues. All references to a named job queue in this specification, such as within the EnqueueJob and NextJob semantic functions, reference the named queue in the surrounding agent.

1.2.1 Forward Progress Guarantees

Implementations must ensure that all unblocked jobs in agents whose [[CanBlock]] property is true eventually make progress.

Note 1

For example, if there are two agents A and B where neither is waiting for an event, and A either enters an infinite loop or blocks in a synchronous operation such as Atomics.futexWait without ever being unblocked, then B will still execute further steps of its program.

Provided that no job run by an agent whose [[CanBlock]] property is false blocks forever, implementations must ensure that all unblocked jobs in agents whose [[CanBlock]] property is false eventually make progress.

Note 2

Agents that share an execution thread can be blocked from making forward progress if any job in any of those agents enters an infinite loop, for example.

Note 3

(Issue 28) This section may be formalized at least partly in the terms of the C++ working paper on forward progress (see the referenced issue), or it may be left as-is. An ES6 execution thread is most plausibly a 'Concurrent agent' in the terms of that paper.

Some work is going on in the HTML spec (see this issue) to make it possible to control whether a worker can share an execution thread or not.

1.3 Agents (NEW)

An agent comprises a set of ECMAScript execution contexts, an execution context stack, a running execution context, a set of named job queues, an Agent Record, and an executing thread. Except for the executing thread, the constituents of an agent belong exclusively to that agent.

An agent's executing thread executes the jobs in the agent's job queues on the agent's execution contexts independently of other agents. An executing thread may be used as the executing thread by multiple agents only if none of the agents sharing the thread has a [[CanBlock]] property that is true.

Note 1

This is a case of bowing to reality, since browsers do share the main thread across tabs in the same process and will prohibit main-thread code from blocking. Forward progress is guaranteed by implementations that set a time limit on how long the main thread can execute in a tab without returning to its event loop.
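
The thread-sharing rule above can be read as a simple predicate over Agent Records. The following is a minimal sketch only; `canShareThread` and the plain-object records are hypothetical illustrations, not operations defined by this proposal.

```javascript
// Hypothetical model: each agent is represented only by its Agent Record.
// A thread may be shared only if no agent on it has [[CanBlock]] = true.
function canShareThread(agentRecords) {
  // A thread with a single agent is not shared, so the rule does not apply.
  if (agentRecords.length <= 1) return true;
  return agentRecords.every((record) => record.CanBlock === false);
}

const mainThreadTabs = [{ CanBlock: false }, { CanBlock: false }];
const workerAndTab = [{ CanBlock: false }, { CanBlock: true }];

console.log(canShareThread(mainThreadTabs)); // true
console.log(canShareThread(workerAndTab));   // false
```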

While an agent's executing thread executes the jobs in the agent's job queues, the agent is the surrounding agent for the code in those jobs. The code uses the surrounding agent to access the specification-level execution objects held within the agent: the running execution context, the execution context stack, the named job queues, and the agent record's fields.
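
To make the notion of a surrounding agent more concrete, here is a toy model in which each agent owns its own named job queues and is installed as the surrounding agent while one of its jobs runs. The names `Agent`, `runNextJob`, `surroundingAgent`, and `enqueueInSurroundingAgent` are hypothetical; only the queue names ScriptJobs and PromiseJobs come from ES6.

```javascript
// Toy model only: these names illustrate the prose and are not spec operations.
let surroundingAgent = null;

class Agent {
  constructor(signifier) {
    this.record = { Signifier: signifier, State: "waiting" };
    this.executionContextStack = [];
    // Each agent owns its own set of named job queues.
    this.jobQueues = { ScriptJobs: [], PromiseJobs: [] };
  }

  enqueueJob(queueName, job) {
    this.jobQueues[queueName].push(job);
  }

  // Run one pending job with this agent installed as the surrounding agent.
  runNextJob(queueName) {
    const job = this.jobQueues[queueName].shift();
    if (job === undefined) return false;
    const previous = surroundingAgent;
    surroundingAgent = this;
    this.record.State = "running";
    try {
      job();
    } finally {
      this.record.State = "waiting";
      surroundingAgent = previous;
    }
    return true;
  }
}

// Code inside a job reaches agent state only through the surrounding agent.
function enqueueInSurroundingAgent(queueName, job) {
  surroundingAgent.enqueueJob(queueName, job);
}

const a = new Agent("A");
const b = new Agent("B");
const log = [];
a.enqueueJob("ScriptJobs", () => {
  log.push("script job in " + surroundingAgent.record.Signifier);
  enqueueInSurroundingAgent("PromiseJobs", () => log.push("promise job in A"));
});
a.runNextJob("ScriptJobs");
a.runNextJob("PromiseJobs");
console.log(log);                            // both entries, in order
console.log(b.jobQueues.PromiseJobs.length); // 0: B's queues are untouched
```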

Table 1: Agent Record Fields

Field name | Value | Meaning
[[LittleEndian]] | Boolean | The default value computed for the isLittleEndian parameter when it is needed by the algorithms GetValueFromBuffer and SetValueInBuffer. The choice is implementation dependent and should be the alternative that is most efficient for the implementation. Constant.
[[CanBlock]] | Boolean | Determines whether the agent can block (to wait for an external unblock event) or not. Constant.
[[State]] | String | The state of the agent. It can take on the following values:
  • "creating" : agent is in the process of being created
  • "waiting" : agent is waiting for work to appear in a queue
  • "running" : agent is processing a job
  • "blocked" : agent has put itself to sleep
  • "destroying" : agent is in the process of being destroyed
  • "destroyed" : agent has been destroyed
[[Signifier]] | A value that admits equality testing | Uniquely identifies the agent within its agent cluster. Constant.
[[IsLockFree1]] | Boolean | True iff atomic operations on one-byte values are lock-free. Constant.
[[IsLockFree2]] | Boolean | True iff atomic operations on two-byte values are lock-free. Constant.

Note 2

(Spec draft note) The [[State]] values are not yet being used, but will be used by to-be-drafted semantic functions in the SAB spec, and may be used here to discuss agent termination.

Note 3

The values of [[IsLockFree1]] and [[IsLockFree2]] are not necessarily constant on a given piece of hardware, but may reflect implementation choices that can vary over time and between ECMAScript implementations.

There is no [[IsLockFree4]] property: 4-byte atomic operations are always lock-free.

An agent is a specification mechanism and need not correspond to any particular artefact of an ECMAScript implementation.

Note 4

There are several ways in which agents could be meaningful without having a direct representation.

The standalone shell for Firefox's JS engine allows workers to be created but has no JS representation for those workers. SharedArrayBuffers are shared not through messages but through a global mailbox mechanism.

The MessagePort mechanism in HTML5 allows workers to be passed around implicitly; it is possible to communicate with a worker without having a direct representation for the worker, only for a channel that the worker listens on.

1.4 Agent Clusters (NEW)

An agent cluster is a maximal set of agents that can communicate by operating on shared memory.

Note 1

Programs within different agents may share memory by unspecified means. At a minimum, the backing memory for SharedArrayBuffer objects can be shared among the agents in the cluster.

There may be agents that can communicate by message passing that cannot share memory; they are never in the same cluster.

Every agent belongs to exactly one agent cluster.

Note 2

The agents in a cluster need not all be alive at some particular point in time. If agent A creates another agent B, after which A terminates and B creates agent C, the three agents are in the same cluster if A could share some memory with B and B could share some memory with C.
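
Read this way, cluster membership behaves like the transitive closure of a "could share memory" relation. The sketch below computes clusters as connected components with a small union-find; `computeClusters` and the pair list are hypothetical, and the relation itself is assumed to be given by the embedding.

```javascript
// Hypothetical helper: group agents into clusters given the pairs of
// agents that could share memory. Uses union-find over agent names.
function computeClusters(agents, canShareMemoryPairs) {
  const parent = new Map(agents.map((name) => [name, name]));

  function find(name) {
    while (parent.get(name) !== name) {
      parent.set(name, parent.get(parent.get(name))); // path halving
      name = parent.get(name);
    }
    return name;
  }

  for (const [x, y] of canShareMemoryPairs) {
    parent.set(find(x), find(y));
  }

  const clusters = new Map();
  for (const name of agents) {
    const root = find(name);
    if (!clusters.has(root)) clusters.set(root, []);
    clusters.get(root).push(name);
  }
  return [...clusters.values()];
}

// A shares with B, and B shares with C; D can only exchange messages.
const clusters = computeClusters(
  ["A", "B", "C", "D"],
  [["A", "B"], ["B", "C"]]
);
console.log(clusters); // [["A", "B", "C"], ["D"]]
```

Every agent ends up in exactly one group, and an agent that shares memory with nobody forms a cluster of its own.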

All agents within a cluster must have the same value for the [[LittleEndian]] property in their respective Agent Records.

Note 3

(Spec draft note) Is the restriction on [[LittleEndian]] overreach? It seemed reasonable at the time, but on some architectures it's possible for different processes to have different endianness. The rule precludes sharing memory among such processes, in any ECMAScript implementation.

All agents within a cluster must have the same value for the [[IsLockFree1]] property in their respective Agent Records; similarly for the [[IsLockFree2]] property.

All agents within a cluster must have different values for the [[Signifier]] property in their respective Agent Records.
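
The per-cluster constraints on Agent Records can be checked mechanically. This is an illustrative sketch: `validateCluster` is a hypothetical helper, the records are plain objects whose keys mirror the Table 1 field names without the double brackets, and signifiers are assumed to be primitives so that a Set can test their equality.

```javascript
// Hypothetical check of the per-cluster constraints on Agent Records.
function validateCluster(records) {
  const problems = [];
  const first = records[0];
  // These fields must agree across the whole cluster.
  for (const field of ["LittleEndian", "IsLockFree1", "IsLockFree2"]) {
    if (records.some((r) => r[field] !== first[field])) {
      problems.push(field + " differs within the cluster");
    }
  }
  // Signifiers must be pairwise distinct within the cluster.
  const signifiers = new Set(records.map((r) => r.Signifier));
  if (signifiers.size !== records.length) {
    problems.push("Signifier is not unique within the cluster");
  }
  return problems;
}

const okCluster = [
  { Signifier: 1, LittleEndian: true, IsLockFree1: true, IsLockFree2: true },
  { Signifier: 2, LittleEndian: true, IsLockFree1: true, IsLockFree2: true },
];
const badCluster = [
  { Signifier: 1, LittleEndian: true, IsLockFree1: true, IsLockFree2: true },
  { Signifier: 1, LittleEndian: false, IsLockFree1: true, IsLockFree2: true },
];

console.log(validateCluster(okCluster));  // []
console.log(validateCluster(badCluster)); // two problems: LittleEndian, Signifier
```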

An agent cluster is a specification mechanism and need not correspond to any particular artefact of an ECMAScript implementation.

An embedding may suspend and wake an agent without the agent's knowledge or cooperation. If the embedding does so, it must suspend and wake all agents in an agent cluster together.

Note 4

The purpose of that restriction is to avoid a situation where a worker deadlocks or starves because another agent has been suspended. For example, if a DOM SharedWorker shares memory with a regular worker, and the regular worker is suspended while it holds a lock (because the web page the regular worker is in is pushed into the window history), and the SharedWorker tries to acquire the lock, then the SharedWorker will be blocked until the regular worker wakes up again, if ever. Meanwhile other workers trying to access the SharedWorker from other web pages will starve.

The implication of the restriction is that it will not be possible to share memory between agents that don't belong to the same suspend/wake collective within the embedding.

(Issue 39) That, in turn, places interesting demands on the structured clone algorithm in web browsers.
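
One way to picture the suspend/wake rule is an embedding that exposes suspension only at cluster granularity. The sketch below is hypothetical throughout (`ClusterScheduler` and its methods are invented for illustration); it shows only that no per-agent suspension entry point exists.

```javascript
// Hypothetical embedding-side model: suspension is applied per cluster,
// never per agent, so no agent is left waiting on a suspended peer.
class ClusterScheduler {
  constructor(agentNames) {
    this.suspended = new Map(agentNames.map((name) => [name, false]));
  }

  // Internal helper: apply the same flag to every agent in the cluster.
  setAll(flag) {
    for (const name of this.suspended.keys()) {
      this.suspended.set(name, flag);
    }
  }

  suspendCluster() {
    this.setAll(true);
  }

  wakeCluster() {
    this.setAll(false);
  }

  isSuspended(name) {
    return this.suspended.get(name);
  }
}

const scheduler = new ClusterScheduler(["page", "dedicatedWorker", "sharedWorker"]);
scheduler.suspendCluster();
console.log(scheduler.isSuspended("dedicatedWorker")); // true
console.log(scheduler.isSuspended("sharedWorker"));    // true
scheduler.wakeCluster();
console.log(scheduler.isSuspended("page"));            // false
```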

An embedding may terminate an agent without the prior knowledge or cooperation of the other agents in its cluster. If an agent is terminated not by programmatic action of its own or of another agent in the cluster but by forces external to the cluster, then the embedding has two choices: either terminate all the agents in the cluster, or provide reliable APIs that allow the agents in the cluster to coordinate so that at least one remaining member of the cluster will be able to detect the termination, with the termination data containing enough information to identify the agent that was terminated.

Note 5

Examples of that type of termination are: operating systems or users terminating agents that are running in separate processes; the embedding itself terminating an agent that is running in-process with the other agents when per-agent resource accounting indicates that the agent is runaway.
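
The two options open to the embedding on externally caused termination can be sketched as a single policy function. Everything here is hypothetical: `terminateExternally` is an invented name, and the `notifySurvivors` callback merely stands in for whatever reliable API an embedding might provide.

```javascript
// Hypothetical embedding policy for externally caused termination.
// When no notification mechanism is supplied, the whole cluster is terminated.
function terminateExternally(cluster, victimSignifier, notifySurvivors) {
  const victim = cluster.find((a) => a.Signifier === victimSignifier);
  if (victim === undefined) throw new Error("unknown agent");
  victim.State = "destroyed";

  const survivors = cluster.filter((a) => a.State !== "destroyed");
  if (typeof notifySurvivors === "function" && survivors.length > 0) {
    // Option 2: the termination data identifies the terminated agent.
    notifySurvivors(survivors, { terminated: victimSignifier });
    return survivors.map((a) => a.Signifier);
  }

  // Option 1: terminate every agent in the cluster.
  for (const agent of cluster) agent.State = "destroyed";
  return [];
}

const notices = [];
const withApi = [
  { Signifier: "A", State: "running" },
  { Signifier: "B", State: "running" },
  { Signifier: "C", State: "blocked" },
];
console.log(terminateExternally(withApi, "B", (s, data) => notices.push(data))); // ["A", "C"]
console.log(notices[0].terminated); // "B"

const withoutApi = [
  { Signifier: "A", State: "running" },
  { Signifier: "B", State: "running" },
];
console.log(terminateExternally(withoutApi, "B")); // []
```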

Note 6

(Spec draft note) Issue 55 contains discussions around the termination signaling requirement. It is not uncontroversial.

The shared memory spec will additionally require that if termination is signaled then the signal creates the necessary happens-before edge in the memory ordering, which is a tougher requirement than the requirement that it be possible to detect termination.

The web platform provides nothing at the moment to detect or signal termination, regardless of shared memory, and this is already a problem for developers. Agent clusters are more tightly coupled than pages that can signal each other, so the problem may be easier to solve for agent clusters. In current browsers, dedicated workers are in-process anyway and an agent cluster will normally terminate en masse, so the requirement is satisfied for now.

1.5 Inter-Agent Communication (NEW)

(Removed; section kept to avoid changing the numbering)

1.6 External suspension of agents (NEW)

(Integrated into the Agent Clusters section; section kept to avoid changing the numbering)

1.7 External termination of agents (NEW)

(Integrated into the Agent Clusters section; section kept to avoid changing the numbering)

2 Structured Data (ES6 24)

2.1 ArrayBuffer Objects (ES6 24.1)

2.1.1 Abstract Operations for ArrayBuffer (ES6 24.1.1)

2.1.1.1 GetValueFromBuffer( arrayBuffer, byteIndex, type [, isLittleEndian] ) (ES6 24.1.1.5)

Replace step 7 of this algorithm with the following:

  • if isLittleEndian is not present, set isLittleEndian to the value of the [[LittleEndian]] property of the surrounding agent's Agent Record.

2.1.1.2 SetValueInBuffer( arrayBuffer, byteIndex, type, value [, isLittleEndian] ) (ES6 24.1.1.6)

Replace step 8 of this algorithm with the following:

  • if isLittleEndian is not present, set isLittleEndian to the value of the [[LittleEndian]] property of the surrounding agent's Agent Record.
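
The effect of the two replaced steps can be approximated in user code with a DataView. This is a simplified sketch for the Uint32 element type only; `agentRecord`, `getUint32FromBuffer`, and `setUint32InBuffer` are hypothetical stand-ins, and probing the platform byte order with typed arrays is an assumption about how [[LittleEndian]] might be chosen.

```javascript
// Hypothetical stand-in for the surrounding agent's Agent Record.
// The probe reads back the first byte of a 16-bit 1 in native byte order.
const agentRecord = {
  LittleEndian: new Uint8Array(new Uint16Array([1]).buffer)[0] === 1,
};

// Simplified analogue of GetValueFromBuffer for the Uint32 type only.
function getUint32FromBuffer(arrayBuffer, byteIndex, isLittleEndian) {
  if (isLittleEndian === undefined) {
    // The replaced step: default to the agent's [[LittleEndian]] value.
    isLittleEndian = agentRecord.LittleEndian;
  }
  return new DataView(arrayBuffer).getUint32(byteIndex, isLittleEndian);
}

// Simplified analogue of SetValueInBuffer for the Uint32 type only.
function setUint32InBuffer(arrayBuffer, byteIndex, value, isLittleEndian) {
  if (isLittleEndian === undefined) {
    isLittleEndian = agentRecord.LittleEndian;
  }
  new DataView(arrayBuffer).setUint32(byteIndex, value, isLittleEndian);
}

const buffer = new ArrayBuffer(4);
setUint32InBuffer(buffer, 0, 0x01020304);

// With the default, the result agrees with a native typed-array view.
console.log(getUint32FromBuffer(buffer, 0) === new Uint32Array(buffer)[0]); // true

// Explicit little-endian and big-endian reads of the same bytes differ.
console.log(getUint32FromBuffer(buffer, 0, true) === getUint32FromBuffer(buffer, 0, false)); // false
```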

3 Web browser embedding (for the HTML/DOM spec)

3.1 Agent mapping

In a web browser an agent is an HTML event loop [here].

Browsers will typically let agents that run on the browser's main thread have [[CanBlock]]=false. Browsers that support several tabs within the same content process may share a single agent execution thread across all the tabs in that case (Firefox does).

3.2 Clarifications and changes to Web Worker semantics

Note

Several of these issues have been reported as bugs against the WHATWG spec; see this bug report.

3.2.1 Actions to start a worker

(Clarification) The only action required to start a worker is to call "new Worker()".

Note 1

(Spec draft note) Firefox requires a trip through the event loop to start the worker. I'm told this won't be fixed, spec or no spec.

The current state of affairs is a minor deadlock hazard. If one worker creates another and then blocks on a futex waiting for the new worker to unblock it, the workers may deadlock since the new one may not have been created. The hazard is reduced by (a) prohibiting blocking on the main thread and (b) either disallowing nested workers or not requiring workers to return to their event loops to get a nested worker started.

(Compatible change) If a worker cannot be started for reasons of resource exhaustion (notably, no threads available, including arbitrary implementation limits on the number of threads) then an error must be reported in some manner TBD (TODO).

Note 2

(Spec draft note) In current Firefox, there is a per-domain limit on the number of workers. An attempt to create a worker will silently not start the worker if the limit has been reached; the worker will be queued and started when another worker has terminated. Again, this creates deadlock hazards.

There are other error situations during worker startup that can't necessarily be signaled synchronously, notably, a load error on the URL. Those have to be signaled via an event callback, or the creating agent must poll the state of the worker to see if it enters an error state (see more below on the state). A callback is probably cleanest, but we'll want the state variable anyway, so maybe both.

3.2.2 Workers are agents

(Clarification) Every worker implements an independent agent as defined above.

Note

Thus the forward-progress guarantee of jobs also applies to workers.

3.2.3 Curtail the license to kill

(Compatible change) Workers may be killed by the browser only for specific reasons. These reasons are TBD (TODO) but include evicting the owning page from the history and closing the owning page. These reasons do not include workers that 'run too long'.

Note

Currently the WHATWG spec allows the browser to kill any worker at any time. The purpose of the rule is probably a combination of the need to stop runaway scripts (without the normal slow-script dialog) and the need to remove workers once a page is evicted from the browser cache or a tab is closed. However, the wording is overly broad. Also, common uses of workers for computation conflict with the ability to detect "runaway" agents:

  • The worker may perform a genuinely long-running computation
  • The worker will have its own user-implemented "event" loop, communicating synchronously through shared memory; it will not use the browser's event loop

3.2.4 Termination detection

(Compatible(?) change) The worker object should have a read-only property called "state" whose value represents the state of the worker (e.g., a string naming the state). One state might be "terminated", indicating the worker is dead.

Note 1

There appears to be no way at present to directly determine whether a Worker has terminated.

Note 2

Possible complementary mechanisms include throwing an exception when a message is sent to a terminated worker, and sending an error event to the creating agent when a worker is killed.

Note 3

The spec here can be related back to the properties of the Agent Record, specified earlier.
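
As a rough illustration of the proposed read-only "state" property, the sketch below models it on a stand-in class. `WorkerHandle` is hypothetical and is not the real Worker interface; of the state names used, only "terminated" appears in the text above, and "running" is an assumption.

```javascript
// Hypothetical sketch: WorkerHandle is not a real API. It only models a
// read-only "state" property like the one proposed for worker objects.
class WorkerHandle {
  #state = "running"; // assumed initial state name

  // Getter without a setter: callers can read the state but not assign it.
  get state() {
    return this.#state;
  }

  terminate() {
    this.#state = "terminated";
  }
}

const handle = new WorkerHandle();
console.log(handle.state); // "running"
handle.terminate();
console.log(handle.state); // "terminated"
```

A creating agent could poll such a property, or combine it with an event callback as discussed for startup errors.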

3.3 Agent termination

If a worker is terminated by a call to its terminate method while it is blocked in a call to futexWait, then the worker is first woken and then immediately terminated (the wakeup is not observable to code running in the worker).

Note 1

(Spec draft note) Firefox supports close handlers that could in principle be used to clean up inconsistent state, but I think that's not in the HTML5 spec.

If a worker is terminated for any other reason, such as the user agent reloading or closing the window or frame, and the worker is blocked in a call to futexWait when it is terminated, then the worker is first woken and then immediately terminated (the wakeup is not observable to code running in the worker).

Note 2

(Spec draft note) After the running script has been terminated the worker's close handler(s) will be run as if the wait had not been aborted.