Datasources
In real-world life-cycle studies, most of the time is typically spent collecting data. You will almost certainly end up with inventory data such as:
id | geo | quantity | ram_size | storage_size | amortization_period | power | co2 | water |
---|---|---|---|---|---|---|---|---|
server-small | FR | 38 | 384 | 61.44 | 5 | 400 | 2855 | 6.46 |
server-medium | FR | 62 | 384 | 11.52 | 5 | 600 | 10155 | 12.6 |
server-large | UK | 3 | 768 | 76.8 | 5 | 900 | 15312 | 24.3 |
Assume that this inventory data is stored as a CSV file, data/inventory.csv, located in the folder data at the root of your project.
To use this data from within your LCA as Code models, you first need to declare a data source.
datasource inventory {
    location = "data/inventory.csv"
    schema {
        id = "an-identifier"
        geo = "FR"
        quantity = 1 p
        ram_size = 16 GB
        storage_size = 1 TB
        amortization_period = 5 year
        power = 400 W
        co2 = 0 kg_CO2_Eq
        water = 0 m3
    }
}
This expression defines a data source named inventory that we can then query (we will see how below).
The schema block declares which columns are available in the file. In particular, the identifiers must match the actual column names in the CSV file.
The schema also specifies the type of the values in each column by declaring default values.
For instance, the statement id = "an-identifier" declares that the column id contains values of type string.
The statement ram_size = 16 GB declares that the column ram_size contains numeric values that must be interpreted as a number of gigabytes.
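The typing rule described above can be sketched in Python. This is only an illustration of the idea, not how LCA as Code is implemented: the names SCHEMA and parse_row, the inline CSV text, and the representation of a quantity as a (number, unit) pair are all assumptions made for this example.

```python
import csv
import io

# Hypothetical stand-in for data/inventory.csv (first two rows of the example table).
CSV_TEXT = """id,geo,quantity,ram_size,storage_size,amortization_period,power,co2,water
server-small,FR,38,384,61.44,5,400,2855,6.46
server-medium,FR,62,384,11.52,5,600,10155,12.6
"""

# Mirror of the schema block: a default is either a string
# or a (number, unit) pair standing for a quantity.
SCHEMA = {
    "id": "an-identifier",
    "geo": "FR",
    "quantity": (1.0, "p"),
    "ram_size": (16.0, "GB"),
    "storage_size": (1.0, "TB"),
    "amortization_period": (5.0, "year"),
    "power": (400.0, "W"),
    "co2": (0.0, "kg_CO2_Eq"),
    "water": (0.0, "m3"),
}


def parse_row(raw, schema):
    """Type each cell according to the kind of default declared in the schema."""
    row = {}
    for column, default in schema.items():
        cell = raw[column]  # raises KeyError if the CSV lacks a declared column
        if isinstance(default, str):
            row[column] = cell  # string column: keep the text as-is
        else:
            _, unit = default
            row[column] = (float(cell), unit)  # numeric column: attach the unit
    return row


rows = [parse_row(raw, SCHEMA) for raw in csv.DictReader(io.StringIO(CSV_TEXT))]
print(rows[0]["id"])        # server-small
print(rows[0]["ram_size"])  # (384.0, 'GB')
```

In this sketch the default values themselves are never read from the file; they only tell the parser whether a column holds text or a number with a unit.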
Now, let's see how you can query this datasource.
Lookup
A common use case for data sources is to associate parameter values with, e.g., a specific piece of equipment.
In our example above, you may want to access the quantity of the server identified as server-small.
The lookup primitive fetches a specific row from a data source.
process my_lookup {
    products {
        1 p material
    }
    variables {
        row = lookup inventory match id = "server-small"
        quantity = row.quantity
        co2 = row.co2
    }
    impacts {
        quantity * co2 co2
    }
}
The lookup primitive returns the first row in the data source that satisfies the matching conditions.
More precisely, the returned value is a record with entries indexed by the columns of the data source.
To access the entry quantity in the record row, you can use the dot notation, i.e., row.quantity.
Note that lookup raises an error if no row matches.
If several rows match, it picks one of them arbitrarily.
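As a rough illustration of these semantics, here is a Python sketch. It is not the LCA as Code implementation: the function lookup, the list INVENTORY, and the choice of LookupError are assumptions made for this example. The sketch returns the first match; as the text notes, with several matches the choice should be treated as arbitrary.

```python
# Rows of the example inventory, reduced to the columns used here.
INVENTORY = [
    {"id": "server-small", "geo": "FR", "quantity": 38, "co2": 2855},
    {"id": "server-medium", "geo": "FR", "quantity": 62, "co2": 10155},
    {"id": "server-large", "geo": "UK", "quantity": 3, "co2": 15312},
]


def lookup(rows, **conditions):
    """Return the first row whose columns equal all the given conditions."""
    for row in rows:
        if all(row[column] == value for column, value in conditions.items()):
            return row
    # No row satisfied the conditions: report an error instead of returning nothing.
    raise LookupError(f"no row matches {conditions}")


row = lookup(INVENTORY, id="server-small")
impact = row["quantity"] * row["co2"]  # same expression as in my_lookup
print(impact)  # 108490
```

Calling lookup(INVENTORY, id="missing") in this sketch raises LookupError, which mirrors the error raised when there is no match.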
For each block
Quite often, the elements of an inventory are to be included as inputs to a process. For instance, the inventory in our example lists the equipment in a given data center. How could we model this data center? Of course, one can define the model manually as follows.
process datacenter_manual {
    products {
        1 p dc
    }
    variables {
        small = lookup inventory match id = "server-small"
        medium = lookup inventory match id = "server-medium"
        large = lookup inventory match id = "server-large"
    }
    impacts {
        small.co2 co2
        medium.co2 co2
        large.co2 co2
    }
}
This approach, however, quickly becomes impractical when the data source contains tens or hundreds of rows. Instead, one can use a for_each block.
process datacenter {
    products {
        1 p dc
    }
    impacts {
        for_each row from inventory {
            row.co2 co2
        }
    }
}
You can also restrict the iteration to a subset of the rows using a matching condition.
process datacenter_fr {
    products {
        1 p dc
    }
    impacts {
        for_each row from inventory match geo = "FR" {
            row.co2 co2
        }
    }
}
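To make the effect of the two for_each blocks concrete, the following Python sketch accumulates the co2 entries the way datacenter and datacenter_fr would, assuming that the contributions of all iterated rows to the same indicator are simply added up. The generator for_each and the list INVENTORY are hypothetical names used for illustration only.

```python
# Rows of the example inventory, reduced to the columns used here.
INVENTORY = [
    {"id": "server-small", "geo": "FR", "quantity": 38, "co2": 2855},
    {"id": "server-medium", "geo": "FR", "quantity": 62, "co2": 10155},
    {"id": "server-large", "geo": "UK", "quantity": 3, "co2": 15312},
]


def for_each(rows, **conditions):
    """Yield every row matching the conditions (all rows if none are given)."""
    for row in rows:
        if all(row[column] == value for column, value in conditions.items()):
            yield row


# datacenter: one co2 contribution per row of the inventory.
total_all = sum(row["co2"] for row in for_each(INVENTORY))
# datacenter_fr: only the rows whose geo column equals "FR".
total_fr = sum(row["co2"] for row in for_each(INVENTORY, geo="FR"))

print(total_all)  # 28322
print(total_fr)   # 13010
```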
Passing records as parameters
Each record in a data source contains different parameter values.
One may want to pass these values as parameters of another process.
You can define a process that accepts a record as a parameter.
The primitive default_record returns the default record of a data source, as defined by the default values in the data source schema.
process server {
    // You can define a parameter as a row from inventory.
    // The default value for this parameter is given by the schema.
    params {
        row = default_record from inventory
    }
    products {
        1 p server
    }
    impacts {
        row.co2 co2
    }
}
A record can be passed as a parameter like any other parameter.
process pool_server {
    products {
        1 p pool_server
    }
    inputs {
        for_each row from inventory {
            // The record variable can be passed to the invoked process.
            row.quantity server from server(row = row)
        }
    }
}
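The interplay between the default record and an explicitly passed record can be pictured with a small Python analogy. This is a sketch under assumptions, not the actual evaluation model: the functions server and pool_server merely mimic the processes of the same name, DEFAULT_RECORD copies the relevant schema defaults, and inputs are assumed to contribute their impacts in proportion to the requested quantity.

```python
# Rows of the example inventory, reduced to the columns used here.
INVENTORY = [
    {"id": "server-small", "geo": "FR", "quantity": 38, "co2": 2855},
    {"id": "server-medium", "geo": "FR", "quantity": 62, "co2": 10155},
    {"id": "server-large", "geo": "UK", "quantity": 3, "co2": 15312},
]

# Counterpart of default_record: the defaults declared in the schema.
DEFAULT_RECORD = {"id": "an-identifier", "geo": "FR", "quantity": 1, "co2": 0}


def server(row=None):
    """co2 impact of one server described by the record `row`."""
    if row is None:
        row = DEFAULT_RECORD  # no record passed: fall back to the schema defaults
    return row["co2"]


def pool_server(rows):
    """Sum, over all rows, of row.quantity servers built from that same row."""
    return sum(row["quantity"] * server(row=row) for row in rows)


print(server())                # 0
print(pool_server(INVENTORY))  # 784036
```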
Sum-product
The sum primitive computes the sum-product of multiple columns. In the example below, the columns quantity and co2 are multiplied point-wise and then summed. For now, only the point-wise product of columns is supported.
process sum_prod {
    products {
        1 p sum_prod
    }
    impacts {
        /*
            Multiplies quantity and co2 row by row, then sums the results.
        */
        sum(inventory, quantity * co2) co2
    }
}
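For the example inventory, the sum-product can be worked out by hand: 38 × 2855 + 62 × 10155 + 3 × 15312 = 784036. The Python sketch below performs the same computation; sum_product and INVENTORY are hypothetical names, and units are left out for brevity.

```python
import math

# Rows of the example inventory, reduced to the columns used here.
INVENTORY = [
    {"id": "server-small", "quantity": 38, "co2": 2855},
    {"id": "server-medium", "quantity": 62, "co2": 10155},
    {"id": "server-large", "quantity": 3, "co2": 15312},
]


def sum_product(rows, *columns):
    """Multiply the given columns row by row, then sum the products."""
    return sum(math.prod(row[column] for column in columns) for row in rows)


print(sum_product(INVENTORY, "quantity", "co2"))  # 784036
print(sum_product(INVENTORY, "quantity"))         # 103
```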
As an exercise, try to define a process that models the
Solution
process average {
    products {
        sum(inventory, quantity) server
    }
    impacts {
        sum(inventory, quantity * co2) co2
    }
}
Or
process average2 {
    products {
        sum(inventory, quantity) server
    }
    impacts {
        for_each row from inventory {
            row.quantity * row.co2 co2
        }
    }
}
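A quick numerical check suggests why the two formulations are interchangeable. The Python sketch below assumes that impacts are expressed relative to the declared amount of product, so that dividing the total co2 by the number of servers gives a per-server figure; the variable names are hypothetical and units are omitted.

```python
# Rows of the example inventory, reduced to the columns used here.
INVENTORY = [
    {"id": "server-small", "quantity": 38, "co2": 2855},
    {"id": "server-medium", "quantity": 62, "co2": 10155},
    {"id": "server-large", "quantity": 3, "co2": 15312},
]

# First formulation: a single sum-product over the columns quantity and co2.
total_sum = sum(row["quantity"] * row["co2"] for row in INVENTORY)

# Second formulation: one contribution per row, accumulated as in a for_each block.
total_loop = 0
for row in INVENTORY:
    total_loop += row["quantity"] * row["co2"]

# Amount of product declared by both processes.
servers = sum(row["quantity"] for row in INVENTORY)

print(total_sum == total_loop)  # True
print(servers)                  # 103
print(total_sum / servers)      # 7612.0
```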