Abstracts of Selected Publications by Stephen T. Pope
Topics
Best-sellers
The Big MAT Book: Courseware for Audio & Multimedia Engineering
(in 3 volumes)
MAT/CREATE, 2008, 665 pages
Multimedia engineering is a broad and complex topic. It is also one of
the fastest-growing and most valuable fields of research and development
within electronic technology. The book before you is an anthology of
curriculum materials developed over the space of 12 years at the
University of California, Santa Barbara for students in UCSB’s Graduate
Program in Media Arts and Technology.
The BigMATBook consists of the presentation slides for eleven ten-week
courses, amounting to almost 500 hours of presentation time. For each
of the eleven courses, the presentation slides are accompanied by the
tables of contents of the course readers, and an overview of the
example code archives. These resources are available for download from
the MAT or HeavenEverywhere web sites (see
http://HeavenEverywhere.com/TheBigMATBook).
The multimedia engineering courses included here cover theory and
practice, hardware and software, visual and audio media, and arts as
well as entertainment applications. Some of the courses (the first two
chapters) are required of all MAT graduate students, and thus must
target less-technical and also non-audio-centric students. The bulk of
this material, though, consists of elective courses that have somewhat
higher-level prerequisites and assume basic knowledge of acoustics and
some (minimal) programming experience in mainstream programming
languages.
Get the PDF file
The Allosphere: An Immersive Multimedia Instrument for Scientific
Data Discovery and Artistic Exploration (with Xavier Amatriain, JoAnn
Kuchera-Morin and Tobias Hollerer)
IEEE Transactions on Multimedia, 2008.
The UCSB Allosphere is a 3-story-high spherical space in which fully
immersive environments can be experienced. It allows for the
exploration of large-scale data sets in an environment that is at the
same time multimodal, multimedia, multi-user, immersive, and
interactive. The Allosphere is being used for research into scientific
visualization/auralization and data exploration but also as a research
environment for behavioral/cognitive scientists and artists. The
facility consists of a perforated aluminum sphere, ten meters in
diameter, suspended inside a near-anechoic cube. The Allosphere is
being equipped with high-resolution active stereo projectors, a
complete 3D sound system with hundreds of speakers and novel
interfaces. Once fully equipped it will enable seamless immersive
projection and 3D audio. In this article we give an overview of the
purpose of the instrument as well as the systems that are being put in
place to equip such a unique environment. We also review the first
results and experiences in developing and using the Allosphere in
several prototype projects.
Get the PDF file
“The Acoustics of a large 3D Immersive Environment: The Allosphere
at UCSB,” (with D. Conant, T. Hoover and K. McNally)
Proc. 2008 ASA-EAA Joint Conference
on Acoustics. Paris.
The Allosphere is a new audio/visual immersion space for the California
Nanosystems Institute at the University of California, Santa Barbara,
used for both scientific and performing-arts studies. This 3-story
sphere with central-axis catwalk permits an unusually large
experiential region. The huge perforated-metal visual projection
sphere, with its principal listening locations centered inside the
sphere, introduces multiple considerations and compromises, especially
since the ideal acoustical environment is anechoic. Video projection
requires opaque light reflectivity of the concave projection surface,
while audio demands extreme sound transmissibility of the screen plus
full-range sound absorptivity outside the sphere. The design requires
high-fidelity spatialization of a large number of simulated sound
sources over a large region near the core, and support of vector-based
amplitude panning, Ambisonic playback, and wave-field synthesis. This
paper discusses considerations that both conform to, and lie outside
of, traditional acoustical analysis methodologies, and briefly reviews
the electroacoustic systems design.
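As a rough illustration of one of the spatialization techniques named above, the following Python sketch computes two-dimensional vector-based amplitude panning gains for a single loudspeaker pair. It is a minimal textbook-style example, not the Allosphere's actual implementation; the function name `vbap_2d_gains` and the angle conventions are assumptions made for this sketch.

```python
import math


def vbap_2d_gains(source_deg, spk1_deg, spk2_deg):
    """Return constant-power gains (g1, g2) for one loudspeaker pair.

    Hypothetical helper: all angles are azimuths in degrees, measured
    in the horizontal plane from the listener's point of view.
    """
    # Unit vectors for the virtual source and the two loudspeakers.
    px = math.cos(math.radians(source_deg))
    py = math.sin(math.radians(source_deg))
    ax = math.cos(math.radians(spk1_deg))
    ay = math.sin(math.radians(spk1_deg))
    bx = math.cos(math.radians(spk2_deg))
    by = math.sin(math.radians(spk2_deg))

    # Solve p = g1 * a + g2 * b for (g1, g2) using Cramer's rule.
    det = ax * by - ay * bx
    if abs(det) < 1e-12:
        raise ValueError("loudspeaker directions must not be collinear")
    g1 = (px * by - py * bx) / det
    g2 = (ax * py - ay * px) / det

    # Normalize so that g1^2 + g2^2 = 1 (constant total power).
    norm = math.hypot(g1, g2)
    return g1 / norm, g2 / norm


# A source midway between speakers at +45 and -45 degrees gets equal gains.
center_gains = vbap_2d_gains(0.0, 45.0, -45.0)
```

A source direction outside the arc spanned by the pair yields a negative gain, which a complete panner would treat as a signal to choose a different loudspeaker pair.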
Get the PDF file
“Interchange Formats for Spatial Audio”
(invited position paper) Proc. 2008
Int’l Computer Music Conference (ICMC), Belfast.
Space has been a central parameter in electroacoustic music composition
and performance since its origins. Nevertheless, the design of a
standardized interchange format for spatial audio performances is a
complex task that poses a diverse set of constraints and problems. This
position paper attempts to describe the current state of the art in
terms of what can be called “easy” today, and what areas pose as-yet
unsolved technical or theoretical problems. The paper ends with a set
of comments on the process of developing a widely useable spatial sound
interchange format.
Get the PDF file
Scripting and Tools for Analysis/Resynthesis of Audio
Proceedings of the 2007 International
Computer Music Conference.
Software tools for audio analysis, signal processing and synthesis come
in many flavors; in general they fall into one of two categories:
interactive tools with limited extensibility, or non-graphical
scripting languages. It has been our attempt to combine the best
features of these two worlds into one framework that supports both (a)
the easy development of GUI-based applications for digital audio signal
processing (DASP), and (b) an extensible text-based scripting language
with built-in libraries for DASP applications. The goal is to combine
the good performance of optimized low-level code for the signal
processing number-crunching, with a powerful, flexible scripting
language and GUI construction tools for application development. We
investigate the solutions to this dilemma on the basis of four concrete
examples in which DASP tools have been used together with the Siren
music/sound package for Smalltalk.
Get the PDF file
Teaching Digital Audio Programming: Notes on a Two-year Course
Sequence
Proceedings of the 2007 International
Computer Music Conference.
The MAT 240 Digital Audio Programming course sequence is a six-quarter
(i.e., two-year) practical workshop class devoted to teaching digital
audio processing techniques and software development at the graduate
level. It has been delivered through several complete iterations at
UCSB since 2000. In this paper, we will introduce the course sequence
topics, describe what students actually do and learn in the course, and
evaluate our challenges, successes and failures.
Get the PDF file
Immersive Audio and Music in the Allosphere (with Xavier Amatriain,
Tobias Hollerer, and JoAnn Kuchera-Morin)
Proceedings of the 2007 International
Computer Music Conference.
The UCSB Allosphere is a 3-story-high spherical instrument in which
virtual environments and performances can be experienced in full
immersion. It is made of a perforated aluminum sphere, ten meters in
diameter, suspended inside an anechoic cube. The space is now being
equipped with high-resolution active stereo projectors, a 3D sound
system with several hundred speakers, and with tracking and interaction
mechanisms. The Allosphere allows for the exploration of large-scale
data sets in an environment that is at the same time multimodal,
multimedia, multi-user, immersive, and interactive. This novel and
unique instrument will be used for research into scientific
visualization/auralization and data exploration, and as a research
environment for behavioral and cognitive scientists. It will also serve
as a research and performance space for artists exploring new forms of
art. In particular, the Allosphere has been carefully designed to allow
for immersive music applications. In this paper, we give an overview of
the instrument, focusing on the audio subsystem. We present first
results and our experiences in developing and using the Allosphere in
several prototype projects.
Get the PDF file
The Siren 7.5 Package for Music and Sound in Smalltalk
MAT/CREATE Internal Report,
2007
Siren is a programming framework for developing music/sound
applications in the Smalltalk programming system. It has been under
development for more than 20 years, and the newest version (7.5) has a
collection of major updates and new subsystems. This paper briefly
introduces Siren, and then concentrates on the significant new
features, interfaces, and applications in Siren 7.5.
Get the PDF file
Software Models and Frameworks for Sound Composition, Synthesis, and
Analysis: The Siren, CSL, and MAK Music Languages
Anthology, June, 2005, updated May,
2007, 462 pages
Music is an undeniably complex phenomenon, so the design of abstract
representations, formal models, and description languages for
music-related data can be expected to be a rich domain. Music-making
consists of a variety of diverse activities, and each of these presents
different requirements for developers of new abstract and concrete data
formats for musician users.
The topic of this work is the design of formal models and
languages for a set of common musical activities including (but not
limited to) composition, performance and production, and semantic
analysis. The background of this work is the 50-year history of
computer music programming languages, which began with low-level and
(by today’s standards) simplistic notations for signal synthesis
routines and compositional algorithms. Over these 50 years, many
generations of new ideas have been applied to programming language
design, and the topics of formal modeling and explicit knowledge
representation have arisen and taken an important place in computer
science, and thus in computer music.
The three concrete systems presented in this anthology have been
developed and refined over a period of 25 years, and address the areas,
respectively, of (a) music composition (Siren), (b) sound synthesis and
processing (CSL), and (c) music data analysis for information retrieval
(MAK). In each successive generation of refinement of these concrete
languages, the underlying models and metamodels have been considered
and incrementally merged, so that the current-generation systems
(Siren 7, CSL 4, and MAK 4) share both superficial and deep models and
expressive facilities. This allows the user (assumed to be a composer,
performer, or musicologist) to share data and functionality across
these domains, and, as will be demonstrated, to extend the models and
frameworks into new areas with relative ease.
The significant contributions of this work to the literature can be
found in (a) the set of design criteria and trade-offs developed for
music language developers, (b) the new object-oriented design patterns
for computer music systems, and (c) the trans-disciplinary design of
the three specific languages for composers, performer/producers, and
musicologists presented here.
Get the PDF file
MODE & Siren: Smalltalk and Music
The Siren 7.5 Package for Music and Sound in Smalltalk
MAT/CREATE Internal Report,
2007
Siren is a programming framework for developing music/sound
applications in the Smalltalk programming system. It has been under
development for more than 20 years, and the newest version (7.5) has a
collection of major updates and new subsystems. This paper briefly
introduces Siren, and then concentrates on the significant new
features, interfaces, and applications in Siren 7.5.
Get the PDF file
Metamodels and Design Patterns in CSL4 (with Xavier Amatriain,
Lance Putnam, Jorge Castellanos, and Ryan Avery)
Proceedings of the 2006 International
Computer Music Conference
The task of building a description language for audio synthesis
and processing consists of balancing a variety of conflicting demands
and constraints such as easy learning curve, usability, flexibility,
extensibility, and run-time performance. There are many alternatives as
to what a modern language for describing signal processing patches
should look like. This paper describes the object-oriented models and
design patterns used in version 4 of the CREATE Signal Library (CSL), a
full rewrite that included an effort to use concepts from the “4MS”
metamodel for multimedia systems, and to integrate a set of design
patterns for signal processing. We refer the reader to other
publications for an introduction to CSL, and will concentrate on design
and implementation choices in CSL4 that simplify the kernel classes,
improve their performance, and ease their extension while using
best-practice software engineering techniques.
Get the PDF file
Recent Developments in Siren: Modeling, Control, and Interaction
for Large-scale Distributed Music Software (with Chandrasekhar
Ramakrishnan)
Proceedings of the 2003 International Computer Music Conference.
This paper describes recent advances in platform-independent
object-oriented software for music and sound processing. The Siren
system is the result of almost 20 years of continuous development in
the Smalltalk programming language; it incorporates an abstract music
representation language, interfaces for real-time I/O in several media,
a user interface framework, and connections to object databases. To
support ambitious compositional and performance applications, the
system is integrated with a scalable realtime distributed processing
framework. Rather than presenting a system overview (Siren is
exhaustively documented elsewhere), we discuss the new features of the
system here, including its integration with new DSP frameworks, new I/O
interfaces, and its use in several recent compositions.
Get the PDF file
Music and Sound Processing in Squeak Using Siren
Invited Chapter in
Squeak: Open Personal Computing and Multimedia edited by
Mark Guzdial and Kim Rose. Prentice-Hall, 2002.
The Siren system is a general-purpose music composition and production
framework integrated with Squeak Smalltalk (1); it is a Smalltalk class
library of about 200 classes for building musical applications. Siren
runs on a variety of platforms with support for real-time MIDI and
multi-channel audio I/O. The system's source code is available for free
on the Internet; see the Siren home page at the URL
http://www.create.ucsb.edu/Siren.
This chapter concentrates on (a) the Smoke music description language,
(b) the real-time MIDI and sound I/O facilities, and (c) the GUIs for
the 2.7 version of Siren. It is intended for a Squeak programmer who is
interested in music and sound applications, or for a computer music
enthusiast who is interested in Squeak applications.
Get the PDF file
The Musical Object Development Environment (MODE)--Ten Years of
Music Software in Smalltalk
Proceedings of the 1994 International Computer Music Conference.
The author has developed a family of software tool kits for composers
with the Smalltalk-80 programming system over the last decade. The
current MODE Version 2 system supports structured composition, flexible
graphical editing of high- and low-level musical objects, real-time
MIDI I/O, software sound synthesis and processing, and other tasks.
This poster will introduce the MODE and SmOKe, its representation
language, and survey the various end-user applications it includes. The
discussion will evaluate the system's performance and requirements.
Get the PDF file
The Interim DynaPiano: An Integrated Tool and Instrument for
Composers
Computer Music Journal 16:3, Fall, 1992, 21 p.
The Interim DynaPiano (IDP) is an integrated computer hardware/software
configuration for music composition, production, and performance based
on a Sun Microsystems Inc. SPARCstation computer and the Musical Object
Development Environment (MODE) software. The IDP SPARCstation is a
powerful hardware-accelerated color graphics RISC-based (reduced
instruction set computer) workstation computer running the UNIX
operating system. It is augmented by large RAM and disk memories and
coprocessors and interfaces for real-time sampled sound and MIDI I/O.
The MODE is a large hierarchy of object-oriented software components
for music written in the Smalltalk-80 language and programming system.
MODE software applications in IDP support flexible structured music
composition, sampled sound recording and processing, and real-time
music performance using MIDI or sampled sounds.
The motivation for the development of IDP is to build a powerful,
flexible, and portable computer-based composer's tool and musical
instrument that is affordable by a professional composer (i.e., around
the price of a good piano or MIDI studio). The hardware and low-level
software of the system consist entirely of off-the-shelf commercial
components. The goal of the high-level and application software is to
exhibit good object-oriented design principles and elegant modern
software engineering practice. The basic configuration of the system is
consistent with a whole series of "intelligent composer's assistants"
based on a core technology that has been stable for a decade.
This article presents an overview of the hardware and software
components of the current IDP system. The background section discusses
several of the design issues in IDP in terms of definitions and a set
of examples from the literature. The hardware system configuration is
presented next, and the rest of the article is a description of the
MODE signal and event representations, software libraries, and
application examples.
Get the PDF file
The SmOKe Music Representation, Description Language, and
Interchange Format
Proceedings of the 1992 International Computer Music Conference.
The Smallmusic Object Kernel (SmOKe) is an object-oriented
representation, description language and interchange format for musical
parameters, events, and structures. The author believes this
representation, and its proposed linear ASCII description, to be
well-suited as a basis for: (1) concrete description interfaces in
other languages, (2) specially-designed binary storage and interchange
formats, and (3) use within and between interactive multimedia and
hypermedia applications in several application domains. The textual
versions of SmOKe share the terseness of note-list-oriented music input
languages, the flexibility and extensibility of "real" music
programming languages, and the non-sequential description and
annotation features of hypermedia description formats.
This description defines SmOKe's basic concepts and constructs, and
presents examples of the music magnitudes and event structures. The
intended audience for this discussion is programmers and musicians
working with digital-technology-based multimedia tools who are
interested in the design issues related to music representations, and
are familiar with the basic concepts of software engineering. Two other
documents ([Smallmusic 1992] and [Pope 1992]) describe the SmOKe
language, and the MODE environment within which it has been
implemented, in more detail.
Get the PDF file
Modeling Musical Structures as EventGenerators
Proceedings of the 1989 International Computer Music Conference.
There is a broad range of music description languages. The common terms
for describing musical structures define a vocabulary that every
musician learns as part of his or her training. The terms we take for
granted in describing music can be used for building generative
software description languages. This paper describes recent work
modeling higher-level musical structures in terms of objects that
understand specialized sub-languages for creation of, and interaction
with, musical structures. The goal is to provide tools for composers to
describe compositions by incrementally refining the behaviors of a
hierarchical collection of structure models.
Get the PDF file
T-R Trees in the MODE (A Tree Editor Based Loosely on Fred's Theory)
Proceedings of the 1991 International Computer Music Conference.
The T-R Trees software system is a set of software tools for the
graphical and programmatic manipulation of expressive and structural
hierarchies in music composition. It is loosely based on the
hierarchies described in Fred Lerdahl and Ray Jackendoff's landmark
book A Generative Theory of Tonal Music--weighted grouping and
prolongational reduction trees (also called tension-relaxation or T-R
trees). This article describes T-R tree derivation, editing, and
application in score representation and management.
Get the PDF file
Distributed Processing
The Distributed Processing Environment for High-Performance
Distributed Multimedia Applications (with Andreas Engberg, Frode Holm,
and Ahmi Wolf)
Proc. 2001 IEEE Multimedia Technology and Applications Conference
Our group is involved in implementing large-scale multimedia software
for application areas ranging from multi-user virtual worlds to complex
real-time sound synthesis. We call this class of system
High-Performance Distributed Multimedia (HPDM) software. The
Distributed Processing Environment (DPE) is an infrastructure for
configuring and managing HPDM software. It consists of several
components that allow the start-up, monitoring, and shut-down of
software services on a network. This report describes the design and
implementation of the prototype DPE system, which we built for the ATON
project.
Get the PDF file
The Real-time (Multimedia) Interface Description Language: RIDL
(with Andreas Engberg and Frode Holm)
Proc. 2001 IEEE Multimedia Technology and Applications Conference
The Real-time Multimedia Interface Description Language—RIDL—is an
extension of the CORBA IDL for use in building distributed real-time
multimedia software systems. We designed RIDL to integrate
quality-of-service (QoS) information, as well as configuration
requirements, into the IDL interface descriptions of our software
components. We have built a flexible first-generation RIDL compiler and
associated repositories.
Get the PDF file
All About CRAM: The CREATE Real-time Application Manager
CREATE Internal Report
The CREATE Real-time Applications Manager (CRAM) is a framework for
developing, deploying, and managing distributed real-time software. It
has evolved in our group at UCSB through three implementations over the
space of five years. The background of CRAM is the work done since the
early 1990s on distributed processing environments (DPEs), which
started in the telecommunications industry (see Appendix 1). CRAM is
unusual among DPEs in that it is very light-weight and efficient, but
also fault-tolerant, and that it supports both planning-time and
run-time load balancing as required by real-time applications. Its main
application areas to date are large-scale music performance systems and
distributed virtual environments.
Get the PDF file
ATON Report 2001.06.1: ATON/UCSB Final Report
CREATE Internal Report
The ATON Project was an ambitious, large-scale, multi-year R&D
effort undertaken by three teams collaborating across several
disciplines. The original project description (see the ATON web site
http://www.create.ucsb.edu/ATON/overview.html) stated, “The project
involves topics as diverse as robotics, computer vision, distributed
multimedia processing, and virtual reality.” For the ATON system, we
need to build a virtual environment (VE) that allows one or more users
to control robots and video cameras located anywhere in the state of
California, and to “see through the eyes” of the robots to manage
traffic incidents. This implies a kind of wide-area distributed
real-time multimedia system that we call High-Performance Distributed
Multimedia (HPDM) software. This report summarizes the work carried out
in the CREATE Lab at UCSB as part of the DiMI ATON Project between 1999
and 2001. We describe the background of the ATON Project, and discuss
our efforts, relating them to our published reports and concrete
deliverables.
Get the PDF file
Computer Music and Music Composition
Producing Kombination XI: Using Modern Hardware
and Software Systems for Composition
Leonardo Music Journal, 2(1): 23-28, 1992.
This article discusses two topics related to the realization of my
composition "Kombination XI: A Ritual Place for Live and Processed
Voices." These are the score's structure representation language and
the software tools for manipulating it using graphical structure
editors, and the process of realization using several different digital
signal processing software and hardware systems. The reason for
focusing on the first issue is the attempt to build a notation and set
of software tools based on weighted trees that span the expressive and
structural domains of music. The second topic is of interest as an
example of the possibility of using several types of computer hardware
and software in consort as one instrument. Numerous score and structure
description and editing examples, and documentation of the realization
process are presented.
Get the PDF file
Fifteen Years of Computer-assisted Composition
Proceedings of the 2nd Brazilian Symposium on Computer Music,
1995.
This paper describes several generations of computer music systems and
the music they have enabled. It will introduce the software tools used
in some of my music compositions realized in the years 1979-94 at a
variety of studios using various software and hardware systems and
programming languages. These tools use a wide range of compositional
methods, including (among others): high-level graphical notations,
limited stochastic selection, Markov transition tables,
forward-chaining expert systems, non-deterministic Petri networks, and
hierarchical rule-based knowledge systems. The paper begins by defining
several of the terms that are frequently used in the computer music
literature with respect to computer-aided composition and realization,
and introduces several of the categories of modern models of music
composition. A series of in-depth examples are then drawn from my works
of the last 15 years, giving descriptions of the models, the software
tools, and demonstrating the resulting music.
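To make one of the compositional methods listed above concrete, the following Python sketch generates a short pitch sequence from a first-order Markov transition table. It is only a generic illustration of the technique, not a reconstruction of the tools described in the paper; the table contents and the names `TRANSITIONS` and `generate` are invented for this example.

```python
import random

# Hypothetical first-order transition table over pitch names: each row
# maps the current pitch to the probabilities of the next pitch.
TRANSITIONS = {
    "C": {"D": 0.5, "E": 0.3, "G": 0.2},
    "D": {"C": 0.4, "E": 0.6},
    "E": {"D": 0.3, "G": 0.7},
    "G": {"C": 1.0},
}


def generate(start, length, rng):
    """Walk the transition table, returning a list of `length` pitches."""
    sequence = [start]
    while len(sequence) < length:
        row = TRANSITIONS[sequence[-1]]
        # Draw the next pitch according to the weights in the current row.
        next_pitch = rng.choices(list(row), weights=list(row.values()))[0]
        sequence.append(next_pitch)
    return sequence


# A seeded generator makes the stochastic result repeatable.
melody = generate("C", 8, random.Random(1))
```

Passing the random generator in explicitly keeps the selection process reproducible, which is convenient when a composer wants to revisit a particular outcome.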
Get the PDF file
Computer Music Workstations I Have Known and Loved
Proceedings of the 1995 International Computer Music Conference.
This paper introduces a set of design criteria and points of current
debate in the development of computer music workstations. It surveys
the systems of the last ten years and makes several subjective comments
on the design and implementation of computer-based tools for music
composition, production, and live performance. The intent is to focus
the reader's attention on the issues of hardware architecture and
software support in defining computer-based tools and instruments.
Get the PDF file
Why is Good Electroacoustic Music So Good? Why is Bad Electroacoustic
Music So Bad?
(expanded version of the Editor's Note in CMJ 18:3 with responses).
YLEM Newsletter 15:4 (July/August, 1995), 4 p.
Get the ASCII text file
Real-Time Performance via User Interfaces to Musical Structures
Proceedings of the Int'l Workshop on Man-Machine Interaction in Live
Performance, Pisa, Italy, June, 1991. Reprinted in
Interface 22(3): 195-212. 9 p.
This informal and subjective presentation will introduce and compare
several software systems written by myself and others for computer
music composition and performance based on higher-level abstractions of
musical data structures. I will then evaluate a few of the issues in
real-time interaction with structural descriptions of musical data.
The premise is that very interesting live-performance software
environments could be based on existing technology for structural music
description, but that much of the current real-time
performance-oriented software for music is rather limited in that it
supports only very low-level notions of musical structures. The
examples will demonstrate various systems for graphical interaction
with procedural, knowledge-based, hierarchical and/or stochastic music
description systems that could be used for live performance.
Get the PDF file (without figures)
Read the HTML version (*with* figures)
Web.La.Radia: Social, Economic, and Political Aspects of Music and
Digital Media
Invited Paper, Salzburg Symposium on New Media Technology and
Networking for Creative Applications (1997). Reprinted in Proceedings
of the 1997 International Computer Music Conference, Thessaloniki.
Reprinted in Computer Music Journal 23:1, Spring, 1999, 10 p.
This informal essay addresses the current status and trajectory of
media art and media technology. In formulating my ideas on these
topics, I found myself being drawn away from my usual technical
concerns, and increasingly to the sociology, economics, and political
relationships of electronic media art and its modes of production and
dissemination. There are several rather bold statements below on the
subject of new media art and art-making on the world-wide web, and I
rely heavily on a series of quotes taken from the literature to make my
points, without the implication that I necessarily agree with every one
of them. I take a critical stance in these comments, but still do not
wish to be considered a “web-Luddite.” I use the web daily, and it is a
major component of my research. On the other hand, I am very concerned
by several trends I see in the web culture and feel that it is
necessary to draw attention to them.
Get the PDF File
Music Information Retrieval and Databases
Feature Extraction and Database Design for Music Software (with
Frode Holm and Alexandre Kouznetsov)
Proceedings of the 2004 International
Computer Music Conference
Persistent storage and access of sound/music meta-data is an
increasingly relevant topic to the developers of multimedia software.
This paper focuses on the design of music signal analysis tools and
database formats for modern applications. It is partly tutorial in
nature, and partly a discussion of design issues. We begin with a
high-level overview of the dimensions of music database (MDB) software,
and then walk through the common feature extraction techniques. A
requirements analysis of several application categories will allow us
to carefully determine which features might be most useful for them.
This leads us to suggest concrete architectural and design criteria,
and to close by introducing several of our recent implemented systems.
The authors believe that much current MDB software suffers due to
ad-hoc design of analysis systems and feature vectors, which often
incorporate only low-level features and are not tuned for the
application at hand. Our goal is to advance the state of the art of
music meta-data extraction and database design by fostering a better
engineering practice in the construction of high-level feature vectors
and analysis engines for music software.
Get the PDF file
The FASTLab Music Analysis Kernel
FASTLab Internal Report
The FASTLab Music Analysis Kernel (FMAK) is a software package for
building and using music and sound databases. It consists of four main
interfaces: analysis, segmentation, clustering, and classification. The
FMAK analyzer computes both low-level and high-level features (called
feature vectors or meta-data) from musical selections. The segmenter
takes these feature vectors and finds the phrase, verse, and section
breaks in music, thus discovering the musical form and allowing us to
reduce the number of feature vectors we need to store. The clustering
functions support data mining in large databases of feature vectors by
grouping the data into well-defined genre clusters. The classifier adds
customizable database pruning and run-time distance metrics for using
genre databases. These four components can be used in a variety of ways
to build software applications that process large volumes of
multimedia data.
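The first two stages described above (analysis and segmentation) can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not FMAK's actual API: the functions `analyze` and `segment`, the choice of features (RMS level and zero-crossing rate), and the fixed distance threshold are all hypothetical, and the clustering and classification stages are omitted.

```python
import math


def analyze(samples, frame_size):
    """Analysis: one (rms, zero_crossing_rate) feature vector per frame."""
    features = []
    for start in range(0, len(samples) - frame_size + 1, frame_size):
        frame = samples[start:start + frame_size]
        rms = math.sqrt(sum(x * x for x in frame) / frame_size)
        # Count sign changes between neighboring samples in the frame.
        crossings = sum(
            1 for a, b in zip(frame, frame[1:]) if (a < 0) != (b < 0)
        )
        features.append((rms, crossings / (frame_size - 1)))
    return features


def segment(features, threshold):
    """Segmentation: frame indices where the feature vector jumps."""
    return [
        i for i in range(1, len(features))
        if math.dist(features[i - 1], features[i]) > threshold
    ]


# Toy signal: a quiet low-frequency tone followed by a loud,
# higher-frequency tone, four frames of 64 samples each.
quiet = [0.1 * math.sin(2 * math.pi * 2 * n / 64) for n in range(256)]
loud = [0.9 * math.sin(2 * math.pi * 16 * n / 64) for n in range(256)]
features = analyze(quiet + loud, 64)
boundaries = segment(features, 0.2)
```

For this toy signal the only boundary falls at frame 4, where the second tone begins; keeping one feature vector per segment rather than per frame is the kind of storage reduction the abstract refers to.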
Get the PDF file
Expert Mastering Assistant (EMA) Version 2.0 Technical
Documentation (with Alex Kouznetsov)
FASTLab Internal Report
This document describes the design and implementation of the “Expert
Mastering Assistant” (EMA) tool, version 2.0, developed by the UCSB
Center for Research in Electronic Art Technology (CREATE) and FastLAB
Inc. for the Panasonic Spin-Up Fund. The EMA is a prototype
artificial-intelligence-based software tool that “listens” to a set of
musical selections and gives expert advice to a mastering engineer,
suggesting parameters for the modules that perform the signal
processing: equalization, compression, reverberation, etc. The EMA
suite consists of two major components: the interactive EMA application
that analyses and processes individual songs with real-time
interactivity, and a number of development applications that are
required as a part of the expert system training process (Figure 1).
Get the PDF file
The Open Music Network Infrastructure (OMNI)
CREATE Internal Report
This proposal describes the Open Music Network Infrastructure (OMNI),
an Internet-based music service that aims to provide music content
providers with a new forum in which to attract music consumers,
enabling the so-called “second music industry.” The OMNI system
consists of content provider interfaces, a large-scale
artificial-intelligence-assisted “smart” music/sound database, and
listener services that allow users to select musical selections based
on their personal taste. The most distinctive feature of OMNI relative
to other web-based music services is its use of a smart indexing and
search component in the database, which helps little-known
musicians find an audience that would like their songs. This
document is aimed at a semi-technical reader.
Get the PDF file
Content Analysis and Queries in a Sound and Music Database
Proceedings of the 1999 International Computer Music Conference.
The Paleo database project at CREATE aims to develop and deploy a
large-scale integrated sound and music database that supports several
kinds of content and analysis data and several domains of queries. The
basic components of the Paleo system are: (1) a scalable
general-purpose object database system, (2) a comprehensive suite of
sound/music analysis (feature extraction) tools, (3) a distributed
interface to the database, and (4) prototype end-user applications.
The Paleo system is based on a rich set of signal and event analysis
programs for feature extraction from sound and music data. The premise
is that, in order to support several kinds of queries, we need to
extract a wide range of different kinds of features from the data as it
is loaded into the database, and possibly to analyze still more in
response to queries. The results of these analyses will be very long
“feature vectors” (or multi-level indices) that describe the
contents of the database. To be useful for a wide range of
applications, the Paleo system must allow several different kinds of
queries, i.e., it needs to manage large and changing feature
vectors. As data in the database is used, the feature vectors can
be simplified. This might mean discarding spectral analysis data for
speech sounds, or metrical grouping trees for unmetered music. This use
of complex and dynamic feature vectors and indices is what sets Paleo
apart from most other media database projects. This paper introduces
the Paleo system's architecture, and then focuses on three issues: the
signal and event analysis routines, the use of constraints in analysis
and queries, and the object storage layer and formats. Some examples of
Paleo usage are also given.
Get the PDF file of the text
Get the PDF file of the presentation slides
Spatial and 3-D Sound Systems
Immersive Audio and Music in the Allosphere (with Xavier Amatriain,
Tobias Hollerer, and JoAnn Kuchera-Morin)
Proceedings of the 2007 International Computer Music Conference.
The UCSB Allosphere is a 3-story-high spherical instrument in which
virtual environments and performances can be experienced in full
immersion. It is made of a perforated aluminum sphere, ten meters in
diameter, suspended inside an anechoic cube. The space is now being
equipped with high-resolution active stereo projectors, a 3D sound
system with several hundred speakers, and with tracking and interaction
mechanisms. The Allosphere allows for the exploration of large-scale
data sets in an environment that is at the same time multimodal,
multimedia, multi-user, immersive, and interactive. This novel and
unique instrument will be used for research into scientific
visualization/auralization and data exploration, and as a research
environment for behavioral and cognitive scientists. It will also serve
as a research and performance space for artists exploring new forms of
art. In particular, the Allosphere has been carefully designed to allow
for immersive music applications. In this paper, we give an overview of
the instrument, focusing on the audio subsystem. We present first
results and our experiences in developing and using the Allosphere in
several prototype projects.
Get the PDF file
Audio in the UCSB CNSI AlloSphere
MAT/CNSI Internal Report
The UCSB AlloSphere is a joint effort of the California NanoSystems
Institute (CNSI) and the graduate program in Media Arts and Technology
(MAT) at the University of California Santa Barbara (UCSB). It is
currently under construction, with completion scheduled for the first
half of 2006. The AlloSphere is designed as an immersive computational
interface for 10 to 20 users, featuring surround-sound data
sonification and immersive visualization (i.e., 3D audio and video
projection) on a spherical surface. It will provide interactive control
by means of microphone arrays, cameras, and mechanical and
magnetic input tracking. The actual shape of the AlloSphere can be
described as two hemispheres with 16-foot radii pulled 8 feet apart,
placed in a 3-story anechoic chamber. A 7-foot-wide bridge runs across
the center, supporting the users. This document describes the
requirements for the audio component of the AlloSphere, introduces the
three prevalent spatial sound processing technologies in use today, and
outlines the AlloSphere audio input and projection design and
implementation plan, from low-level transducer elements to high-level
network protocols.
Get the PDF file
The State of the Art in Sound Spatialization
There are several aspects to the field of spatial sound, each of which
poses different challenges and offers different potential applications.
Although our understanding of aural perception is still incomplete, we
are able to both synthesize and record spatial sound fields, and to
render sound such that the fidelity of localization is very high (for a
specific listener).
There are several well-known and effective techniques for creating
the perceptual cues that our brains use to localize sound, but the
systems that scale well to large spaces or to many listeners are not
the same ones that give the best localization fidelity. The formal
study of spatial sound performance in larger spaces (e.g., concert
halls) is still in its (relative) infancy. Most work in this area has
been ad hoc, treating the spatial sound performance situation more as
an instrumental performance than as a controlled experiment.
This presentation will explore the aspects of aural perception that
contribute to the difficulties, and the potential, in the recording and
playback of spatial sound, and will survey the current techniques used
in this area.
Get the PDF File
Building Sound into a Virtual Environment: An Aural Perspective Engine
for a Distributed Interactive Virtual Environment (An APE for a DIVE).
(with Lennart E. Fahlén)
Report of the Distributed Systems Laboratory of the Swedish Institute
for Computer Science, Stockholm, August, 1992.
We have investigated the addition of spatially-localized sound
to an existing graphics-oriented synthetic environment (virtual reality
system). To build "3-D audio" systems that are robust,
listener-independent, real-time, multi-source, and able to give stable
sound localization is beyond the current state of the art, even using
expensive special-purpose hardware. The "auralizer" or "aural renderer"
described here was built as a test-bed for experimenting with the known
techniques for generating sound localization cues based on the
geometrical models available in a synthetic 3-D world. This paper
introduces the psychoacoustical background of sound localization, and
then describes the design and usage of the DIVE auralizer. We close by
evaluating the system's implementation and performance.
Get the PDF file
The Use of 3-D Audio in a Synthetic Environment (with
Lennart E. Fahlén)
Proceedings of the 1993 AIMI Colloquium, Milan, Italy.
(See the above abstract.)
Get the PDF file
Machine Tongues--Computer Music Journal Survey or Tutorial Articles
Machine Tongues XI: Object-oriented Software Design
Computer Music Journal 13(2):9-22, Summer, 1989
Object-oriented programming is a term that represents a collection
of new techniques for problem-solving and software engineering. Two
previous articles in this "Machine Tongues" series have introduced
object-oriented programming, presenting tutorials on this technology
and describing its application to music modeling and software
development (Krasner 1980, Lieberman 1982). This paper discusses the
new problem-solving techniques that constitute the object-oriented
design methodology. Object-oriented analysis, synthesis, design and
implementation are presented, while stressing the issues of design by
analytical modeling, design for reuse, and the development of software
packages in terms of frameworks, toolkits and customizable
applications. Numerous object-oriented software description examples
and architectural structures are presented, including music modeling,
representation and interactive applications. This essay will outline
object-oriented problem-solving and software design in a
language-independent manner. Examples will be taken primarily from the
Smalltalk-80 (TM of ParcPlace Systems) programming system, but the
reader need only refer to some of the other articles in this issue of
Computer Music Journal for descriptions of systems based on other
languages and programming environments. No basic introduction to the
terms or techniques of object-oriented languages will be presented
here.
Get the PDF file
Machine Tongues XV: Three Packages for Software Sound Synthesis
Computer Music Journal 17(2): 23-54, Summer, 1993
The origin of the technology and methodology of modern computer
music is certainly the Music V family of software sound synthesis
systems developed since the late 1950s. In the "old days," this
consisted of batch computer processing of musical programs expressed in
terms of instrument definitions (programs) and score note lists (input
data), generating sampled sound output data to off-line storage for
later performance. The noticeable rekindling of interest in programs
and languages for software sound synthesis (SWSS) and software digital
audio signal processing (DSP) using general-purpose computers is due to
a number of factors, not least among them the dramatic increase in the
power of personal workstations over the last five years.
There are currently three widely-used, portable, C-language SWSS
tools: (in alphabetical order) cmix (Lansky 1990), cmusic (Moore 1990),
and Csound (Vercoe 1991). This article will discuss the technology of
SWSS and then present and compare these three systems. It is divided
into three parts; the first introduces SWSS in terms of progressive
examples. Part two compares the three systems using the same two
instrument/score examples written in each of them. The final section
presents informal benchmark tests of the systems run on two different
hardware platforms (a Sun Microsystems SPARCstation-2 IPX and a NeXT
Computer Inc. TurboCube machine) and subjective comments on various
features of the languages and programming environments of
state-of-the-art SWSS software.
Get the PDF file
Machine Tongues XVIII. A Child's Garden of Sound File Formats
(with Guido van Rossum)
Computer Music Journal 19(1): 25-63 Spring, 1995.
This article introduces a few of the many ways that sound data
can be stored in computer files, and describes several of the file
formats that are in common use for this purpose. This text is an
expanded and edited version of a "frequently asked questions" (FAQ)
document that is updated regularly by one of the authors (van Rossum).
Extensive references are given here to printed and network-accessible
machine-readable documentation and source code resources.
Get the PDF file
Object-Oriented Programming and Design Patterns
Metamodels and Design Patterns in CSL4 (with Xavier Amatriain,
Lance Putnam, Jorge Castellanos, and Ryan Avery)
Proceedings of the 2006 International Computer Music Conference.
The task of building a description language for audio synthesis and
processing consists of balancing a variety of conflicting demands and
constraints such as an easy learning curve, usability, flexibility,
extensibility, and run-time performance. There are many alternatives as
to what a modern language for describing signal processing patches
should look like. This paper describes the object-oriented models and
design patterns used in version 4 of the CREATE Signal Library (CSL), a
full rewrite that included an effort to use concepts from the “4MS”
metamodel for multimedia systems, and to integrate a set of design
patterns for signal processing. We refer the reader to other
publications for an introduction to CSL, and will concentrate on design
and implementation choices in CSL4 that simplify the kernel classes,
improve their performance, and ease their extension while using
best-practice software engineering techniques.
Get the PDF file
The Well-Tempered Object: Musical Applications of Object-Oriented
Software Technology -- A Structured Anthology on Software Science and
Systems based on Articles from Computer Music Journal 1980-89
Compiled and edited by Stephen Travis Pope.
Published by MIT Press, 1991
See Well-Tempered Object Web Page
A Description of the Model-View-Controller User Interface Paradigm in
the Smalltalk-80 System (The MVC Cookbook) (with Glenn Krasner)
Journal of Object-Oriented Programming 1(3):26-49
This essay describes the Model-View-Controller (MVC) programming
paradigm and methodology used in the Smalltalk-80™ programming system.
MVC programming is the application of a three-way factoring, whereby
objects of different classes take over the operations related to the
application domain, the display of the application's state, and the
user interaction with the model and the view. We present several
extended examples of MVC implementations and of the layout of composite
application views. The Appendices provide reference materials for the
Smalltalk-80 programmer wishing to understand and use MVC better within
the Smalltalk-80 system.
Get the PDF file
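The three-way factoring described in this abstract can be illustrated
with a minimal sketch. It is written in Python rather than Smalltalk-80
and is not taken from the paper. The class names (CounterModel,
CounterView, CounterController) and the "+" and "-" commands are
hypothetical. The change-notification scheme is only loosely modeled on
the idea of a model informing its dependent views.

```python
class CounterModel:
    """Model: application-domain state; knows nothing about display or input."""

    def __init__(self):
        self.value = 0
        self._dependents = []

    def add_dependent(self, dependent):
        self._dependents.append(dependent)

    def increment(self, amount=1):
        self.value += amount
        self._changed()

    def _changed(self):
        # Notify every registered dependent (here, views) of the change.
        for dependent in self._dependents:
            dependent.update(self)


class CounterView:
    """View: displays the model's state; here it just records text lines."""

    def __init__(self, model):
        self.lines = []
        model.add_dependent(self)
        self.update(model)  # show the initial state

    def update(self, model):
        self.lines.append(f"count = {model.value}")


class CounterController:
    """Controller: translates user input into operations on the model."""

    def __init__(self, model):
        self.model = model

    def handle(self, command):
        if command == "+":
            self.model.increment(1)
        elif command == "-":
            self.model.increment(-1)
        # Unknown commands are ignored.


model = CounterModel()
view = CounterView(model)
controller = CounterController(model)
for command in "++-":
    controller.handle(command)
```

The controller never touches the view, and the model never formats
output. Each concern can therefore be replaced independently, for
example by attaching a second view to the same model.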
Presentation Slides
Keynote Speech from the CWU Symposium on Undergraduate Research and
Creative Expression (SOURCE)
Get the slides as a PDF File
See also STP's SOURCE Links
The State of the Art in "Sound and Music Computing"
Slides for a presentation given at the weekly computer science
colloquium, UCSB, Feb. 7, 1996.
Get the PDF file
Composition by Refinement
Presentation at the AIMI Conference, 1989.
Description of the use of the HyperScore ToolKit for composition.
Get the PDF file
Building Large-scale Interactive Systems with OSC, Siren, CSL, and CRAM
UC Berkeley AudioIcon Workshop, 2003
Get the PDF file
CREATE White Papers and Project Reports
Distributed Multimedia Systems R&D at CREATE
Since 1996, the UCSB Center for Research in Electronic Art
Technology (CREATE) has been the home of a series of projects on
distributed software systems for real-time and multimedia applications.
Several aspects of our work are relevant to new classes of applications
as more and more systems are built using distributed object software
technology for real-time services. This white paper describes our
previous projects and innovations in this area and our plans for the
future.
Get the PDF file.
Research on Spatial and Surround Sound at CREATE
Researchers at the UCSB Center for Research in Electronic Art
Technology (CREATE) have been developing spatial sound performance
systems and multichannel surround sound rendering software for several
years. We use these systems as components of immersive user interfaces
for a variety of applications, as well as for the performance of
spatialized music. This white paper surveys our previous work in the
field and describes our plans for the future.
Get the PDF file.
Research on Music/Sound Databases at CREATE
Large-scale storage of sound and music has only become possible
in the last decade. With this, and the new possibility for wide-area
distribution of multimedia over the Internet, there arose a new
requirement for flexible and powerful databases for musical and audio
data. Since 1996, our work at CREATE has focused on database frameworks
for multimedia applications, and on analysis and feature extraction
techniques for music and sound databases. This white paper describes
our results and presents several of our plans for future applications.
Get the PDF file.
Application and User Interface Development at CREATE
The history of computer applications in music reaches back
into the 1950s. Only recently, however, has it been possible to control
complex musical processes such as algorithmic composition or
sophisticated sound synthesis programs in real-time. Advanced software
and hardware technology also allow us to develop user interfaces that
let non-musicians (and even non-readers) be musically creative. These
two domains of application development and user interface construction
have been important tasks at CREATE for ten years. We present examples
of tools we've developed below, and discuss what features they
introduce that might be useful to other application areas.
Get the PDF file.
The CREATE Signal Library (“Sizzle”): Design, Issues, and
Applications (with Chandrasekhar Ramakrishnan)
Proceedings of the 2003 International Computer Music Conference
The CREATE Signal Library (CSL) is a portable general-purpose software
framework for sound synthesis and digital audio signal processing. It
is implemented as a C++ class library to be used as a standalone
synthesis server, or embedded as a library into other programs. The
first section of this paper describes the overall design of CSL version
3 and gives a series of progressive code examples. We also present
CSL's facilities for network I/O of control and sample streams, and the
development and deployment of distributed CSL systems. Of greater
interest is the subsequent discussion of the design issues we faced in
implementing CSL, and the presentation of a few of the applications in
which we've used CSL over the last year.
Get the PDF file.
See Also
Full Bibliography
List of Musical Compositions
Example Reviews of My Music
Computer Music Journal WWW/FTP Archives (many music-related links)
Return to home page
For more detailed information, mail a letter to STP.
[Stephen Travis Pope, stp@create.ucsb.edu]