Table of Contents
1. Overview of the ARIA Framework
2. Formal Specification of the Architecture
3. Precision, Delay, and Resource Requirement Models
4. DANS: ARIA toward the Distributed Networks
Overview of the ARIA Framework
The main components of the ARIA architecture are depicted in Figure 1 and described below.
Figure 1: ARIA architecture
The Intelligent Stage at DMA is equipped with various video cameras and pressure sensors. The new sensors that are being installed into the Intelligent Stage include: floor mat sensors to generate data
for object localization and motion classification, microphone arrays for sound localization and beamforming, Vicon 8i
Realtime system with 8 cameras for tracking the position and motion of a human body that is appropriately marked with
reflective markers, ultrasound sensors for object tracking, and video sensors for determining the presence or absence of
marked persons, determining their relative spatial positions, and detecting certain simple events. The Intelligent Stage
will act as a test-bed for validating the results of the proposed research.
Interactive performances will require an innovative information architecture that will process, filter, and fuse sensory
inputs and actuate audio-visual responses in real-time, while providing appropriate QoS guarantees. Therefore, we
propose to develop an adaptive and programmable ARchitecture for Interactive Arts (ARIA). ARIA will be capable
of real-time sensing and streaming various types of audio, video, and motion data, accessing external data sources,
extracting various features from streamed data, and fusing and mapping streams onto output devices as constrained by
the QoS requirements. The ARIA media-flow architecture will provide a run-time kernel: a quality-adaptive and
programmable media-flow network consisting of fusion operators and media-stream integration paths.
Visual Design Language Interface
ARIA will provide the choreographer with a visual design tool to describe the various processing components, such as
sensors and display elements, in the interactive performance. The specifications will include streaming characteristics of
the sensors; precisions and computational overheads of the feature extractors; schemas, interfaces, and data access costs
of the external data sources; functionality and QoS of the fusion operators and integration pathways in the media-flow
network; and display and audio features of the actuators. The language will also provide mechanisms for describing local
(per-processing-component) and end-to-end (sensor-to-actuator) delay, QoS, and frequency constraints. As discussed
in the related work section, there are various data- and media-flow languages, most of which rely on visual tools. We
will develop a visual language to specify and design media-flow networks, building on existing languages (such as Max)
used in the domain of media flow specification. This language will be more expressive than the existing languages, in
that it will allow for description of QoS constraints along with the media-flow connectivity.
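To make the idea concrete, the following is a hypothetical textual counterpart to what such a visual specification might capture: components with their streaming and QoS characteristics, connectivity, and per-component plus end-to-end constraints. Every key, value, and component name below is an illustrative assumption, not ARIA syntax.

```python
# A hypothetical media-flow specification: components, connections, and the
# local and end-to-end (sensor-to-actuator) QoS constraints the visual
# language would express. All names and numbers are illustrative.
spec = {
    "components": {
        "floor_sensor": {"kind": "sensor", "frequency_hz": 100, "precision": 0.999},
        "motion_filter": {"kind": "filter", "delay_ms": 5, "precision_factor": 0.9},
        "display": {"kind": "actuator", "max_frequency_hz": 0.5},
    },
    "connections": [
        ("floor_sensor", "motion_filter"),
        ("motion_filter", "display"),
    ],
    "constraints": {
        "end_to_end_delay_ms": 50,   # sensor-to-actuator delay bound
        "min_precision": 0.8,        # QoS floor along the path
    },
}
```

A design tool could validate such a description against the formal model given in the next section before deploying the media-flow network.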
Formal Specification of the Architecture
The ARIA media-flow architecture is modeled as a flow network, G(V,E), where
- the set, V, of vertices (or nodes) represents the set of sensors,
filters, fusion operators, external sources, and actuators;
- the set, E, of edges represents media streams between components.
Figure 2: An example media flow network
Figure 2 provides an example flow network. Given two vertices vs and ve, a
flow path, path(vs, ve), between these two vertices is a sequence,
<vs, ..., ve>, of vertices such that each pair of consecutive vertices in the
sequence is connected by an edge. If there is no such path between vs and ve,
then path(vs, ve) = < >. A flow cycle, then, is a flow path where vs = ve. A
flow graph without any cycles is called an acyclic flow graph, and one with
cycles is called a cyclic flow graph. In the rest of
the section, we formally describe the components that constitute the
quality-adaptive media-flow network.
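The flow-network definitions above can be sketched in code. This is a minimal sketch, not an ARIA implementation: the class and method names are assumptions, vertices are component names, and edges are the media streams between them.

```python
from collections import defaultdict

class FlowNetwork:
    """A sketch of G(V, E): vertices are components, edges are media streams."""

    def __init__(self):
        self.edges = defaultdict(list)   # vertex -> list of successor vertices

    def add_edge(self, u, v):
        self.edges[u].append(v)

    def path(self, vs, ve, _seen=None):
        """Return a flow path <vs, ..., ve> as a list, or [] for path = < >."""
        seen = _seen if _seen is not None else set()
        seen.add(vs)
        for nxt in list(self.edges[vs]):
            if nxt == ve:
                return [vs, ve]
            if nxt not in seen:
                sub = self.path(nxt, ve, seen)
                if sub:
                    return [vs] + sub
        return []

    def is_cyclic(self):
        """A flow cycle is a flow path with vs = ve; check every vertex."""
        return any(self.path(v, v) for v in list(self.edges))

net = FlowNetwork()
net.add_edge("floor_sensor", "motion_filter")
net.add_edge("motion_filter", "fusion")
net.add_edge("fusion", "display")
```

With only forward edges the graph is acyclic; adding a feedback edge such as `("display", "fusion")` turns it into a cyclic flow graph.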
Objects
In ARIA, the basic information unit is an object. An object is produced by a
sensor or an external data source. An object does not exist in isolation; it
is a part of an object stream (or stream) as discussed below. An ARIA media flow
network can consist of multiple streams of objects. Depending on the task,
an object can be as simple as a numeric value (such as an integer denoting
the pressure applied on a surface sensor) or as complex as an image component
segmented out from frames in a video sequence. Each object streamed through
the network consists of
- an object payload, such as a string, a numeric value, or an image, and
- a meta-data header that describes the object properties:
  - the size of the object data, which defines the memory or buffer
    requirement for the object,
  - the precision of the object data (for different object types, precision
    may mean different things; for an image, its precision may mean its
    resolution, whereas for a coordinate value, it may mean the level of
    confidence provided by the object tracking sensors),
  - the set of buffer usage stamps of the object as it is transformed
    through filtering and fusion operations, and
  - the set of timestamps acquired by the object each time it goes through
    sensing, filtering, and fusion operations. The total time delay incurred
    by the object can be calculated from this set of timestamps.
We denote the set of all object payload types using
O. In the
rest of the proposal, we refer to object payload types as object types.
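The object structure above, a payload plus a meta-data header, can be sketched as a small dataclass. The field and method names are illustrative assumptions; only the four header properties come from the text.

```python
import time
from dataclasses import dataclass, field

@dataclass
class MediaObject:
    """Sketch of an ARIA object: payload plus meta-data header."""
    payload: object                 # e.g. a pressure value or an image segment
    size: int                       # memory/buffer requirement of the object
    precision: float                # e.g. resolution, or tracking confidence
    buffer_stamps: list = field(default_factory=list)
    timestamps: list = field(default_factory=list)

    def stamp(self, component, buffer_used):
        """Record passage through a sensing, filtering, or fusion component."""
        self.buffer_stamps.append((component, buffer_used))
        self.timestamps.append((component, time.monotonic()))

    def total_delay(self):
        """Total delay incurred, derived from the first and last timestamps."""
        if len(self.timestamps) < 2:
            return 0.0
        return self.timestamps[-1][1] - self.timestamps[0][1]
```

Each operator along a flow path would call `stamp`, so the end-to-end delay of an object can be recovered from its header alone.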
Streams
In ARIA, each stream denotes a transmission channel of objects of the same type.
We classify streams into two types: regular and irregular streams.
Regular Streams A regular stream is a data stream, where all objects
have a constant frequency, resource requirement, and precision.
- For example, a sequence of 2-byte surface pressure values, measured within
99.9% precision and generated every 10 milliseconds by a floor sensor, forms
a regular object stream.
Consequently, given an object type O ∈ O, we can represent a regular stream
type as a quadruple <O, f, b, p>, where f is the frequency with which data is
available in the stream, b is the resource requirement of each object, and
p ∈ [0, 1] is the precision of the data in the stream. The set of all regular
stream types (of objects of type O) is denoted as RS(O).
Irregular Streams An irregular stream, on the other hand, is a data
stream where object frequency, resource requirement, and precision are
varying or even unpredictable.
Given an object type O ∈ O, we represent an irregular stream type as
<O, f, b, p>, where f is the expected frequency with which data is available,
b is the expected resource requirement of each object, and p is the expected
precision of the objects in the stream. Note that in addition to the expected
frequency value, other stochastic properties of the streams can also be
described. In this proposal, for the sake of clarity, we are limiting the
discussion to expected frequencies of the streams. If the stream does not
have an expected frequency, then f = ⊥; if the objects in the stream do not
have an expected precision, then p = ⊥; and if the objects in the stream do
not have an expected size, then b = ⊥. The set of all irregular stream types
(of objects of type O) is denoted as IS(O).
- For example, let us consider a face extraction module that recognizes and
segments faces in a video feed and makes the resulting face-objects
available for further processing. Since the availability of face-objects in the
resulting stream depends on the video feed and since sizes of the objects
depend on the closeness of the faces to the camera, the resulting stream is
irregular.
Since any stream is either regular or irregular, the set
of all stream types (of objects of type O) is
AS(O) = RS(O) ∪ IS(O).
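The stream-type quadruple <O, f, b, p> can be sketched as follows. For irregular streams the fields hold expected values, and `None` stands in for the undefined symbol ⊥; all names are illustrative assumptions.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class StreamType:
    """Sketch of a stream type <O, f, b, p>; None models an undefined value."""
    object_type: str                # O: the object payload type
    frequency: Optional[float]      # f: objects per second (expected if irregular)
    resource: Optional[int]         # b: bytes per object (expected if irregular)
    precision: Optional[float]      # p in [0, 1] (expected if irregular)
    regular: bool = True            # member of RS(O) if True, of IS(O) otherwise

# A regular stream: 2-byte pressure values, 99.9% precision, every 10 ms.
pressure = StreamType("pressure", frequency=100.0, resource=2, precision=0.999)

# An irregular stream: face-objects whose frequency and size are unpredictable.
faces = StreamType("face", frequency=None, resource=None, precision=0.8,
                   regular=False)
```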
Sensors
Sensors record and transmit various features of the real world environment. In
the ARIA architecture, sensors act as stream sources. Each sensor (of object
type O), n(O), has a corresponding set, S(O) ⊆ AS(O), of stream types that it
can generate. However, at any given point in time a sensor can generate only
one of these streams.
- For example, consider a motion sensor that can either
provide high-precision motion information every 100 milliseconds or low-precision motion
information every 10 milliseconds. This sensor has two stream types.
A sensor, n(O) = <S(O)>, can make objects of type O available at different
frequencies, sizes, and precisions. Consequently, a scalable sensor may
need to consider the trade-off between object availability, resource
requirements, and quality. Sensors that deliver only regular streams are
called regular sensors, whereas those that deliver irregular streams are
called irregular sensors.
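The motion-sensor example above can be sketched as a scalable sensor n(O) = <S(O)> that owns several stream types but emits exactly one at a time, trading off availability, resource requirements, and quality. The class, the `select` policy, and all numbers are illustrative assumptions.

```python
from collections import namedtuple

# Minimal stand-in for the stream-type quadruple <O, f, b, p>.
StreamType = namedtuple("StreamType", "object_type frequency resource precision")

# The motion sensor from the example: high-precision motion data every 100 ms,
# or low-precision motion data every 10 ms (frequencies in objects/second).
high = StreamType("motion", frequency=10.0, resource=16, precision=0.99)
low = StreamType("motion", frequency=100.0, resource=4, precision=0.60)

class Sensor:
    """Sketch of n(O) = <S(O)>: many stream types, one active at a time."""

    def __init__(self, object_type, stream_types):
        self.object_type = object_type
        self.stream_types = list(stream_types)   # S(O)
        self.active = self.stream_types[0]

    def select(self, min_precision=0.0, min_frequency=0.0):
        """Pick a stream type meeting the QoS lower bounds, if any exists."""
        for s in self.stream_types:
            if s.precision >= min_precision and s.frequency >= min_frequency:
                self.active = s
                return s
        return None

motion_sensor = Sensor("motion", [high, low])
```

A regular sensor would hold only regular stream types in S(O); an irregular sensor, only irregular ones.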
Actuators
While sensors generate object streams, actuators consume object streams and
map them to appropriate visual, aural, or haptic outputs.
- For example, consider a monitor that can display the face-objects delivered
to it and can change the picture on the screen every 2 seconds. This monitor
is an actuator with a frequency upper-limit.
Each actuator (of object type O), α(O), is of
the form α(O) = <S(O)>,
where S(O) is the
set of object streams that the actuator can accept. As with sensors, an
actuator can accept only one object stream at a given time.
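The monitor example can be sketched as an actuator α(O) = <S(O)> that consumes one stream at a time and rejects streams it cannot keep up with. The class and method names are illustrative assumptions.

```python
class Actuator:
    """Sketch of α(O) = <S(O)>: consumes one acceptable stream at a time."""

    def __init__(self, object_type, max_frequency):
        self.object_type = object_type
        self.max_frequency = max_frequency   # the frequency upper-limit
        self.current = None

    def accepts(self, object_type, frequency):
        return (object_type == self.object_type
                and frequency <= self.max_frequency)

    def attach(self, object_type, frequency):
        """Attach a stream; an actuator serves one stream at a given time."""
        if not self.accepts(object_type, frequency):
            raise ValueError("stream not acceptable for this actuator")
        self.current = (object_type, frequency)

# The monitor from the example: displays face-objects, one picture per 2 s,
# i.e. an upper limit of 0.5 objects per second.
monitor = Actuator("face", max_frequency=0.5)
```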
Filters
A filter takes an object stream as input, processes and transforms the
objects in the stream, and outputs a new stream consisting of the transformed
objects.
- For example, consider a module that takes a stream of face-objects as its
input and returns the emotion of the face as its output. This module is a
transforming filter. Note that the precision of the result may depend on the
number of consecutive faces considered or on what type of
heuristic/neural-net is used. Consequently, the filter may provide multiple
precisions, each with its own end-to-end delay and resource requirement.
More formally, a transforming filter, tf(O, O'), takes a stream of type
S = <O, f, b, p> ∈ AS(O) and returns a stream of type
S' = <O', f', b', p'> ∈ AS(O'). Each transforming filter has
- a set, P, of precision factors it can operate at. Each precision factor,
  ρ ∈ P, describes the degree of precision change due to the filtering
  operation, i.e., ρ = p'/p;
- a set, D, of end-to-end delays: for each precision factor ρ ∈ P, there
  exists a corresponding end-to-end delay in D;
- a set, R, of resource requirements: for each precision factor ρ ∈ P, there
  is a corresponding resource requirement in R; and
- an input frequency range, F.
Consequently, each filter, tf(O, O'), is a tuple of the form
<O, O', P, D, R, F>. There are some
special filters that do not necessarily operate on the "content" of the
objects in the stream, but on their higher-level properties or attributes, such
as frequencies, precisions, or buffer requirements. Obviously, while
altering higher-level properties of the streams they may also alter the
contents of the objects. Since these operators may specifically be used in
ensuring that the streams satisfy local operator and global end-to-end QoS
constraints, we will treat them separately. The special filters consist of
Frequency-Scale Filters A frequency-scale filter, fsfπ, is a special filter
which, given a stream s = <O, f, b, p>, returns a stream
s' = <O, f·π, b', p'>. In other words, it modifies the frequency of the data
in the stream as a function of the original data frequency.
Precision-Scale Filters A precision-scale filter, psfρ, is a special filter
which, given a stream s = <O, f, b, p>, outputs a stream
s' = <O, f', b', p·ρ>. In other words, it modifies (improves or degrades) the
precision of the objects that pass through the filter. Note that if p = ⊥,
then the precision-scale filter changes the precision of each individual
object by a factor of ρ, but the resulting stream does not have an expected
precision.
Resource-Scale Filters A resource-scale filter, bsfκ, is a special filter
which, given a stream s = <O, f, b, p>, outputs a stream
s' = <O, f', b·κ, p'>. In other words, it modifies the resource requirement
of the objects that pass through the filter.
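The three scale filters can be sketched as functions over the stream quadruple <O, f, b, p>. Here `None` stands in for an undefined expected value (⊥); the tuple layout and function names are illustrative assumptions.

```python
def frequency_scale(stream, pi):
    """fsf: scale the stream frequency by the factor pi."""
    O, f, b, p = stream
    return (O, None if f is None else f * pi, b, p)

def precision_scale(stream, rho):
    """psf: scale the precision by rho. If p is undefined, each individual
    object is still scaled, but the stream keeps no expected precision."""
    O, f, b, p = stream
    return (O, f, b, None if p is None else p * rho)

def resource_scale(stream, kappa):
    """bsf: scale the per-object resource requirement by kappa."""
    O, f, b, p = stream
    return (O, f, None if b is None else b * kappa, p)

# A regular pressure stream: 100 Hz, 2 bytes per object, precision 0.999.
s = ("pressure", 100.0, 2, 0.999)
s = frequency_scale(s, 0.5)     # downsample to 50 Hz
s = resource_scale(s, 2)        # e.g. widen each sample to 4 bytes
```

Chaining such filters is how a media-flow network could be tuned to satisfy local and end-to-end QoS constraints without changing its connectivity.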