eafpy.eaf module

exception eafpy.eaf.ReadDatasetsError(error_code)

Bases: Exception

Custom exception class for an error returned by the read_datasets function

Parameters: error_code (int) – Error code returned by the read_datasets C function, which maps to a string from

eafpy.eaf.avg_hausdorff_dist(data, ref, maximise=False, p=1)

Calculate average Hausdorff distance

See igd()

eafpy.eaf.data_subset(dataset, set)

Select data points from a specific dataset. Returns a single set, without the set number column

This can be used to parse data for inputting to functions such as igd() and hypervolume().

Similar to the subset() function, but can only return 1 set and removes the last column (set number)

Parameters:

dataset (numpy array) – Numpy array of numerical values and set numbers, containing multiple sets. For example the output of the read_datasets() function
set (integer) – Select a single set from the dataset, where the selected set is equal to this argument

Returns:

A single set containing only the objective data (set numbers are excluded).

Return type:

numpy array

Examples

>>> dataset = eaf.read_datasets("./doc/examples/input1.dat")
>>> data1 = eaf.data_subset(dataset, set = 1)

The above selects dataset 1 and removes the set number so it can be used as an input to functions such as hypervolume()

>>> eaf.hypervolume(data1, [10, 10])
90.46272764755885
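
As a rough illustration of what this selection amounts to, the sketch below reproduces it with plain numpy boolean indexing on a small hand-made array. The helper name `select_set` and the toy data are assumptions made for illustration; they are not part of eafpy.

```python
import numpy as np

def select_set(dataset, set_id):
    """Return the rows of one set, without the trailing set-number column."""
    dataset = np.asarray(dataset, dtype=float)
    mask = dataset[:, -1] == set_id   # rows whose last column equals set_id
    return dataset[mask, :-1]         # drop the set-number column

# Toy dataset: two objectives plus a set number in the last column
toy = np.array([[1.0, 2.0, 1],
                [3.0, 0.5, 1],
                [2.0, 2.0, 2]])
print(select_set(toy, 1))
```

The result has one column fewer than the input, which is the shape expected by functions that take objective data only.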

See also

subset()

eafpy.eaf.epsilon_additive(data, ref, maximise=False)

Computes the epsilon metric, either additive or multiplicative.

For epsilon_mult, all values in data and ref must be larger than 0.

Parameters:

data (numpy.ndarray) – Numpy array of numerical values, where each row gives the coordinates of a point in objective space. If the array is created from the read_datasets() function, remove the last (set) column
ref (numpy.ndarray or list) – Reference point set as a numpy array or list. Must have same number of columns as a single point in the dataset
maximise (bool or list of bool) – Whether the objectives must be maximised instead of minimised. Either a single boolean value that applies to all objectives or a list of booleans, with one value per objective. Also accepts a 1d numpy array with value 0/1 for each objective

Returns:

A single numerical value

Return type:

float

Examples

>>> dat = np.array([[3.5,5.5], [3.6,4.1], [4.1,3.2], [5.5,1.5]])
>>> ref = np.array([[1, 6], [2,5], [3,4], [4,3], [5,2], [6,1]])
>>> eaf.epsilon_additive(dat, ref = ref)
2.5

>>> eaf.epsilon_mult(dat, ref = ref)
3.5
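
For intuition, the two values above can be reproduced with a few lines of numpy. This is a minimal sketch assuming minimisation of all objectives; the function names `epsilon_additive_sketch` and `epsilon_mult_sketch` are hypothetical and this is not necessarily how eafpy computes the metric internally.

```python
import numpy as np

def epsilon_additive_sketch(data, ref):
    """Smallest shift that makes `data` weakly dominate every point of `ref`."""
    data = np.asarray(data, dtype=float)
    ref = np.asarray(ref, dtype=float)
    # diff[i, j, k] = data[i, k] - ref[j, k]
    diff = data[:, None, :] - ref[None, :, :]
    # worst objective per (data, ref) pair, best data point per ref point,
    # then the worst-served ref point
    return diff.max(axis=2).min(axis=0).max()

def epsilon_mult_sketch(data, ref):
    """Same idea with ratios instead of differences; all values must be > 0."""
    data = np.asarray(data, dtype=float)
    ref = np.asarray(ref, dtype=float)
    ratio = data[:, None, :] / ref[None, :, :]
    return ratio.max(axis=2).min(axis=0).max()

dat = np.array([[3.5, 5.5], [3.6, 4.1], [4.1, 3.2], [5.5, 1.5]])
ref = np.array([[1, 6], [2, 5], [3, 4], [4, 3], [5, 2], [6, 1]])
print(epsilon_additive_sketch(dat, ref))  # 2.5
print(epsilon_mult_sketch(dat, ref))      # 3.5
```

Both values are driven by the reference point (1, 6): the closest data point, (3.5, 5.5), is worse by 2.5 (additive) or by a factor of 3.5 (multiplicative) in the first objective.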

eafpy.eaf.epsilon_mult(data, ref, maximise=False)

Multiplicative epsilon metric

See epsilon_additive()

eafpy.eaf.filter_dominated(data, maximise=False, keep_weakly=False)

Remove dominated points according to Pareto optimality. See is_nondominated() for details.

eafpy.eaf.filter_dominated_sets(dataset, maximise=False, keep_weakly=False)

Filter dominated sets for multiple sets

Executes the filter_dominated() function for every set in a dataset and returns a dataset, preserving set numbers.

Examples

>>> dataset = eaf.read_datasets("./doc/examples/input1.dat")
>>> subset = eaf.subset(dataset, range = [3,5])
>>> eaf.filter_dominated_sets(subset)
array([[2.60764118, 6.31309852, 3.        ],
       [3.22509709, 6.1522834 , 3.        ],
       [0.37731545, 9.02211752, 3.        ],
       [4.61023932, 2.29231998, 3.        ],
       [0.2901393 , 8.32259412, 4.        ],
       [1.54506255, 0.38303122, 4.        ],
       [4.43498452, 4.13150648, 5.        ],
       [9.78758589, 1.41238277, 5.        ],
       [7.85344142, 3.02219054, 5.        ],
       [0.9017068 , 7.49376946, 5.        ],
       [0.17470556, 8.89066343, 5.        ]])

The above returns sets 3,4,5 with dominated points within each set removed.
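
The per-set behaviour can be sketched in plain numpy as follows. This is an illustration under stated assumptions (minimisation, duplicates kept), with hypothetical helper names `nondominated_mask` and `filter_sets_sketch`; it is not the library's implementation.

```python
import numpy as np

def nondominated_mask(points):
    """True for rows not strictly dominated by another row (minimisation)."""
    points = np.asarray(points, dtype=float)
    # leq[i, j]: row j is no worse than row i in every objective
    leq = (points[None, :, :] <= points[:, None, :]).all(axis=2)
    # lt[i, j]: row j is strictly better than row i in some objective
    lt = (points[None, :, :] < points[:, None, :]).any(axis=2)
    return ~(leq & lt).any(axis=1)

def filter_sets_sketch(dataset):
    """Filter each set on its own and keep the set-number column."""
    dataset = np.asarray(dataset, dtype=float)
    kept = []
    for set_id in np.unique(dataset[:, -1]):
        rows = dataset[dataset[:, -1] == set_id]
        kept.append(rows[nondominated_mask(rows[:, :-1])])
    return np.vstack(kept)

toy = np.array([[1.0, 4.0, 1],
                [2.0, 5.0, 1],   # dominated by (1, 4) within set 1
                [3.0, 1.0, 1],
                [0.5, 0.5, 2],   # better than all of set 1, but sets are independent
                [0.6, 0.6, 2]])  # dominated by (0.5, 0.5) within set 2
print(filter_sets_sketch(toy))
```

Points are only compared with other points of the same set, so a point in set 1 survives even if a point in set 2 dominates it.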

eafpy.eaf.get_diff_eaf(x, y, intervals=None, debug=False)

eafpy.eaf.get_eaf(data, percentiles=[], debug=False)

Empirical attainment function (EAF) calculation

Calculate EAF in 2d or 3d from the input dataset

Parameters:

data (numpy array) – Numpy array of numerical values and set numbers, containing multiple sets. For example the output of the read_datasets() function
percentiles (list) – A list of percentiles to calculate. If empty, all possible percentiles are calculated. Note the maximum
debug (bool) – (For developers) print out debugging information in the C code

Returns:

Returns a numpy array containing the EAF data points, with the same number of columns as the input argument, but a different number of rows. The last column represents the EAF percentile for that data point

Return type:

numpy array

Examples

>>> dataset = eaf.read_datasets("./doc/examples/input1.dat")
>>> subset = eaf.subset(dataset, range = [7,10])
>>> eaf.get_eaf(subset)
array([[  0.62230271,   3.56945324,  25.        ],
       [  0.86723965,   1.58599089,  25.        ],
       [  6.43135537,   1.00153569,  25.        ],
       [  9.7398055 ,   0.36688707,  25.        ],
       [  0.6510164 ,   9.42381213,  50.        ],
       [  0.79293574,   6.46605414,  50.        ],
       [  1.30291449,   4.50417698,  50.        ],
       [  1.58498886,   2.87955367,  50.        ],
       [  7.04694467,   1.83484358,  50.        ],
       [  9.7398055 ,   1.00153569,  50.        ],
       [  0.99008784,   8.84691923,  75.        ],
       [  1.06855707,   6.7102429 ,  75.        ],
       [  3.34035397,   2.89377444,  75.        ],
       [  9.30137043,   2.14328532,  75.        ],
       [  9.7398055 ,   1.83484358,  75.        ],
       [  9.94332713,   1.50186503,  75.        ],
       [  1.06855707,   8.84691923, 100.        ],
       [  3.34035397,   6.7102429 , 100.        ],
       [  4.93663823,   6.20957074, 100.        ],
       [  7.92511295,   3.92669598, 100.        ]])
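
To read the percentile column: the EAF value at a point is the percentage of sets that attain it, meaning they contain at least one point that is no worse in every objective (assuming minimisation). With four sets, as in the example above, the only possible levels are 25, 50, 75 and 100. The sketch below evaluates this for a single query point on toy data; `attainment_percent` is a hypothetical helper, and it does not reproduce the attainment-surface computation performed by get_eaf().

```python
import numpy as np

def attainment_percent(dataset, point):
    """Percentage of sets with at least one point <= `point` in every objective."""
    dataset = np.asarray(dataset, dtype=float)
    point = np.asarray(point, dtype=float)
    set_ids = np.unique(dataset[:, -1])
    attained = 0
    for set_id in set_ids:
        objs = dataset[dataset[:, -1] == set_id, :-1]
        if (objs <= point).all(axis=1).any():
            attained += 1
    return 100.0 * attained / len(set_ids)

# Four toy sets; the last column is the set number
toy = np.array([[1.0, 3.0, 1],
                [3.0, 1.0, 1],
                [2.0, 2.0, 2],
                [4.0, 4.0, 3],
                [0.5, 5.0, 4]])
print(attainment_percent(toy, [2.0, 3.0]))  # 50.0 (attained by sets 1 and 2)
print(attainment_percent(toy, [5.0, 5.0]))  # 100.0 (attained by every set)
```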

eafpy.eaf.hypervolume(data, ref)

Hypervolume indicator

Computes the hypervolume metric with respect to a given reference point assuming minimization of all objectives.

Parameters:

data (numpy.ndarray) – Numpy array of numerical values, where each row gives the coordinates of a point in objective space. If the array is created from the read_datasets() function, remove the last column
ref (numpy array or list) – Reference point set as a numpy array or list. Must be same length as a single point in the dataset

Returns:

A single numerical value, the hypervolume indicator

Return type:

float

Examples

>>> dat = np.array([[5,5],[4,6],[2,7], [7,4]])
>>> eaf.hypervolume(dat, ref = [10, 10])
38.0

Select Set 1 of dataset, and remove set number column

>>> dat = eaf.read_datasets("./doc/examples/input1.dat")
>>> set1 = dat[dat[:,2]==1, :2]

This set contains dominated points so remove them

>>> set1 = eaf.filter_dominated(set1)
>>> eaf.hypervolume(set1, ref= [10, 10])
90.46272764755885
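
For two objectives the indicator is simply the area dominated by the points and bounded by the reference point, which can be computed with a sweep over the sorted points. The following is a minimal sketch for the two-objective minimisation case only; `hypervolume_2d` is a hypothetical name and the library's own algorithm may differ.

```python
import numpy as np

def hypervolume_2d(data, ref):
    """Area dominated by `data` and bounded by `ref` (two objectives, minimisation)."""
    data = np.asarray(data, dtype=float)
    ref = np.asarray(ref, dtype=float)
    # Ignore points that are not strictly better than the reference point
    data = data[(data < ref).all(axis=1)]
    # Sweep in increasing order of the first objective (ties: second objective)
    data = data[np.lexsort((data[:, 1], data[:, 0]))]
    area = 0.0
    best_y = ref[1]
    for x, y in data:
        if y < best_y:  # dominated points add nothing
            area += (ref[0] - x) * (best_y - y)
            best_y = y
    return area

dat = np.array([[5, 5], [4, 6], [2, 7], [7, 4]])
print(hypervolume_2d(dat, [10, 10]))  # 38.0
```

Each point contributes a rectangle, here 24 + 6 + 5 + 3 = 38, matching the first example above. Dominated points are skipped by the sweep, which is why filtering them out beforehand does not change the value.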

eafpy.eaf.igd(data, ref, maximise=False)

Inverted Generational Distance (IGD and IGD+) and Averaged Hausdorff Distance.

Functions to compute the inverted generational distance (IGD and IGD+) and the averaged Hausdorff distance between nondominated sets of points.

See the full documentation here: https://mlopez-ibanez.github.io/eaf/reference/igd.html

Parameters:

data (numpy.ndarray) – Numpy array of numerical values, where each row gives the coordinates of a point in objective space. If the array is created from the read_datasets() function, remove the last (set) column.
ref (numpy.ndarray or list) – Reference point set as a numpy array or list. Must have same number of columns as the dataset.
maximise (bool or list of bool) – Whether the objectives must be maximised instead of minimised. Either a single boolean value that applies to all objectives or a list of booleans, with one value per objective. Also accepts a 1d numpy array with value 0/1 for each objective.
p (float, default 1) – Hausdorff distance parameter, used by avg_hausdorff_dist(). Must be larger than 0.

Returns:

A single numerical value

Return type:

float

Examples

>>> dat =  np.array([[3.5,5.5], [3.6,4.1], [4.1,3.2], [5.5,1.5]])
>>> ref = np.array([[1, 6], [2,5], [3,4], [4,3], [5,2], [6,1]])
>>> eaf.igd(dat, ref = ref)
1.0627908666722465

>>> eaf.igd_plus(dat, ref = ref)
0.9855036468106652

>>> eaf.avg_hausdorff_dist(dat, ref)
1.0627908666722465
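
The three indicators differ only in how distances between the two point sets are measured and averaged. The sketch below follows the usual textbook definitions under the assumption that all objectives are minimised; the function names are hypothetical and the code is meant as an illustration of the definitions, not as the eafpy implementation.

```python
import numpy as np

def igd_sketch(data, ref):
    """Mean distance from each reference point to its nearest data point."""
    data, ref = np.asarray(data, dtype=float), np.asarray(ref, dtype=float)
    dist = np.linalg.norm(data[:, None, :] - ref[None, :, :], axis=2)
    return dist.min(axis=0).mean()

def igd_plus_sketch(data, ref):
    """Like IGD, but only objectives where the data point is worse count."""
    data, ref = np.asarray(data, dtype=float), np.asarray(ref, dtype=float)
    worse = np.maximum(data[:, None, :] - ref[None, :, :], 0.0)
    return np.linalg.norm(worse, axis=2).min(axis=0).mean()

def avg_hausdorff_sketch(data, ref, p=1):
    """Maximum of the p-averaged nearest distances in both directions."""
    data, ref = np.asarray(data, dtype=float), np.asarray(ref, dtype=float)
    dist = np.linalg.norm(data[:, None, :] - ref[None, :, :], axis=2)
    igd_p = np.mean(dist.min(axis=0) ** p) ** (1.0 / p)  # ref -> data
    gd_p = np.mean(dist.min(axis=1) ** p) ** (1.0 / p)   # data -> ref
    return max(igd_p, gd_p)

dat = np.array([[3.5, 5.5], [3.6, 4.1], [4.1, 3.2], [5.5, 1.5]])
ref = np.array([[1, 6], [2, 5], [3, 4], [4, 3], [5, 2], [6, 1]])
print(igd_sketch(dat, ref))            # about 1.0628
print(igd_plus_sketch(dat, ref))       # about 0.9855
print(avg_hausdorff_sketch(dat, ref))  # about 1.0628
```

In this example the averaged Hausdorff distance equals the IGD because the reference-to-data direction gives the larger of the two averages.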

eafpy.eaf.igd_plus(data, ref, maximise=False)

Calculate IGD+ indicator

See igd()

eafpy.eaf.is_nondominated(data, maximise=False, keep_weakly=False)

Identify and remove dominated points according to Pareto optimality.

Parameters:

data (numpy array) – Numpy array of numerical values, where each row gives the coordinates of a point in objective space. If the array is created from the read_datasets() function, remove the last column.
maximise (single bool, or list of booleans) – Whether the objectives must be maximised instead of minimised. Either a single boolean value that applies to all objectives or a list of boolean values, with one value per objective. Also accepts a 1d numpy array with value 0/1 for each objective
keep_weakly (bool) – If False, return False for any duplicates of nondominated points

Returns:

is_nondominated returns a boolean array of the same length as the number of rows of data, where True means that the point is not dominated by any other point.

filter_dominated returns a numpy array with only mutually nondominated points.

Return type:

bool array

Examples

>>> S = np.array([[1,1], [0,1], [1,0], [1,0]])
>>> eaf.is_nondominated(S)
array([False,  True, False,  True])

>>> eaf.is_nondominated(S, maximise = True)
array([ True, False, False, False])

>>> eaf.filter_dominated(S)
array([[0, 1],
       [1, 0]])

>>> eaf.filter_dominated(S, keep_weakly = True)
array([[0, 1],
       [1, 0],
       [1, 0]])
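
The dominance test itself can be written as a pairwise comparison. The sketch below is an assumption-laden illustration rather than the library code: `is_nondominated_sketch` is a hypothetical name, and it keeps duplicates of nondominated points, so for minimisation it corresponds to keep_weakly=True rather than to the default.

```python
import numpy as np

def is_nondominated_sketch(data, maximise=False):
    """Boolean mask of rows that no other row strictly dominates."""
    data = np.asarray(data, dtype=float)
    # Flip the sign of maximised objectives so that everything is minimised
    flip = np.where(np.broadcast_to(maximise, data.shape[1]), -1.0, 1.0)
    pts = data * flip
    # leq[i, j]: row j is no worse than row i in every objective
    leq = (pts[None, :, :] <= pts[:, None, :]).all(axis=2)
    # lt[i, j]: row j is strictly better than row i in some objective
    lt = (pts[None, :, :] < pts[:, None, :]).any(axis=2)
    return ~(leq & lt).any(axis=1)

S = np.array([[1, 1], [0, 1], [1, 0], [1, 0]])
print(is_nondominated_sketch(S).tolist())                 # [False, True, True, True]
print(is_nondominated_sketch(S, maximise=True).tolist())  # [True, False, False, False]
print(S[is_nondominated_sketch(S)])                       # rows kept, duplicates included
```

Indexing the input with the mask gives the same three rows as the keep_weakly=True example above.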

eafpy.eaf.normalise(data, to_range=[0.0, 1.0], lower=nan, upper=nan, maximise=False)

Normalise points per coordinate to a range, e.g., to_range = [1,2], where the minimum value will correspond to 1 and the maximum to 2.

Parameters:

data (numpy.ndarray) – Numpy array of numerical values, where each row gives the coordinates of a point in objective space. See normalise_sets() to normalise data that includes set numbers (Multiple sets)
to_range (numpy array or list of 2 points) – Normalise values to this range. If the objective is maximised, it is normalised to (to_range[1], to_range[0]) instead.
lower (list or np array) – Lower bounds on the values. If np.nan, the minimum value of each coordinate is used.
upper (list or np array) – Upper bounds on the values. If np.nan, the maximum value of each coordinate is used.
maximise (single bool, or list of booleans) – Whether the objectives must be maximised instead of minimised. Either a single boolean value that applies to all objectives or a list of booleans, with one value per objective. Also accepts a 1D numpy array with values 0 or 1 for each objective

Returns:

Returns the data normalised as requested.

Return type:

numpy array

Examples

>>> dat = np.array([[3.5,5.5], [3.6,4.1], [4.1,3.2], [5.5,1.5]])
>>> eaf.normalise(dat)
array([[0.   , 1.   ],
       [0.05 , 0.65 ],
       [0.3  , 0.425],
       [1.   , 0.   ]])

>>> eaf.normalise(dat, to_range = [1,2], lower = [3.5, 3.5], upper = 5.5)
array([[1.  , 2.  ],
       [1.05, 1.3 ],
       [1.3 , 0.85],
       [2.  , 0.  ]])
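
Both outputs above follow from a per-column linear rescaling that maps [lower, upper] onto to_range. The sketch below shows that arithmetic for the minimisation case; `normalise_sketch` is a hypothetical name, it uses None rather than np.nan to mean "take the bound from the data", and it does not handle the maximise argument or columns where upper equals lower.

```python
import numpy as np

def normalise_sketch(data, to_range=(0.0, 1.0), lower=None, upper=None):
    """Rescale each column linearly so that [lower, upper] maps onto to_range."""
    data = np.asarray(data, dtype=float)
    # Default bounds: per-column minimum and maximum of the data
    lower = data.min(axis=0) if lower is None else np.asarray(lower, dtype=float)
    upper = data.max(axis=0) if upper is None else np.asarray(upper, dtype=float)
    lo, hi = to_range
    return lo + (data - lower) / (upper - lower) * (hi - lo)

dat = np.array([[3.5, 5.5], [3.6, 4.1], [4.1, 3.2], [5.5, 1.5]])
print(normalise_sketch(dat))
print(normalise_sketch(dat, to_range=(1, 2), lower=[3.5, 3.5], upper=5.5))
```

Values outside the given bounds land outside to_range, which explains the 0.85 and 0 in the second example: the second objective has values below its lower bound of 3.5.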

eafpy.eaf.normalise_sets(dataset, range=[0, 1], lower='na', upper='na', maximise=False)

Normalise dataset with multiple sets

Executes the normalise() function for every set in a dataset (performs normalise on every set separately)

Examples

>>> dataset = eaf.read_datasets("./doc/examples/input1.dat")
>>> subset = eaf.subset(dataset, range = [4,5])
>>> eaf.normalise_sets(subset)
array([[1.        , 0.38191742, 4.        ],
       [0.70069111, 0.5114669 , 4.        ],
       [0.12957487, 0.29411141, 4.        ],
       [0.28059067, 0.53580626, 4.        ],
       [0.32210885, 0.21797067, 4.        ],
       [0.39161668, 0.92106178, 4.        ],
       [0.        , 1.        , 4.        ],
       [0.62293227, 0.11315216, 4.        ],
       [0.76936124, 0.58159784, 4.        ],
       [0.12957384, 0.        , 4.        ],
       [0.82581672, 0.66566917, 5.        ],
       [0.44318444, 0.35888982, 5.        ],
       [0.80036477, 0.23242446, 5.        ],
       [0.88550836, 0.51482968, 5.        ],
       [0.89293026, 1.        , 5.        ],
       [1.        , 0.        , 5.        ],
       [0.79879657, 0.21247419, 5.        ],
       [0.07562783, 0.80266586, 5.        ],
       [0.        , 0.98703813, 5.        ],
       [0.6229605 , 0.8613516 , 5.        ]])

eafpy.eaf.rand_non_dominated_sets(num_points, num_sets=10, shape=3, scale=1)

Create randomised non-dominated sets

Create a dataset of random non-dominated sets following a gamma distribution. This is slow for higher number of points (> 100)

Parameters:

num_points (integer) – Number of points in the resulting dataset
num_sets (integer) – Number of datapoints per set. There should be an equal number of points per set, so num_points % num_sets should equal 0
shape (float) – Shape parameter for the gamma distribution
scale (float) – Scale parameter for the gamma distribution

Returns:

An (n, 3) numpy array containing non dominated points and set numbers. The last column represents the set numbers

Return type:

np.ndarray (n, 3)

eafpy.eaf.read_datasets(filename)

Reads an input dataset file, parsing the file and returning a numpy array

Parameters: filename (str) – Filename of the dataset file. Each row of the table appears as one line of the file. Datasets are separated by an empty line. If it does not contain an absolute path, the file name is relative to the current working directory. If the filename has extension '.xz', it is decompressed to a temporary file before reading it.
Returns: An array containing a representation of the data in the file. The first n-1 columns contain the numerical data for each of the objectives. The last column identifies the set that each row belongs to.
Return type: numpy.ndarray

Examples

>>> eaf.read_datasets("./doc/examples/input1.dat") 
array([[ 8.07559653,  2.40702554,  1.        ],
       [ 8.66094446,  3.64050144,  1.        ],
       [ 0.20816431,  4.62275469,  1.        ],
       ...
       [ 4.92599726,  2.70492519, 10.        ],
       [ 1.22234394,  5.68950311, 10.        ],
       [ 7.99466959,  2.81122537, 10.        ],
       [ 2.12700289,  2.43114174, 10.        ]])

The numpy array represents this data:

Objective 1	Objective 2	Set Number
8.07559653	2.40702554	1.0
8.66094446	3.64050144	1.0
etc.	etc.	etc.
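
To make the file layout concrete, the sketch below parses the same kind of text from an in-memory string: one row per line, a blank line between sets, and a set number appended as the last column. It is a simplified stand-in with a hypothetical name (`parse_datasets_text`); the real function is backed by C code, reads from a file, and also handles '.xz' input, none of which is modelled here.

```python
import io
import numpy as np

def parse_datasets_text(text):
    """Parse whitespace-separated rows; a blank line starts a new set."""
    rows = []
    set_id = 1
    seen_row_in_set = False
    for line in io.StringIO(text):
        line = line.strip()
        if not line:
            # A blank line closes the current set, if it has any rows
            if seen_row_in_set:
                set_id += 1
                seen_row_in_set = False
            continue
        rows.append([float(v) for v in line.split()] + [float(set_id)])
        seen_row_in_set = True
    return np.array(rows)

text = """\
8.0 2.4
8.6 3.6

0.2 4.6
"""
print(parse_datasets_text(text))
```

The first two rows receive set number 1 and the row after the blank line receives set number 2.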

eafpy.eaf.subset(dataset, set=-2, range=[])

Subset is a convenience function for extracting a set or range of sets from a larger dataset. It takes a dataset with multiple set numbers, and returns 1 or more sets (with their set numbers)

Use data_subset() instead to choose a single set and drop the set number column.

Parameters:

dataset (numpy array) – Numpy array of numerical values and set numbers, containing multiple sets. For example the output of the read_datasets() function
set (integer) – Select a single set from the dataset, where the selected set is equal to this argument
range (list (length 2)) – Select sets from the dataset with an inequality. range[0] <= set_num <= range[1]

Returns:

A numpy array with the same columns as the input, containing only the selected sets

Return type:

numpy array

Examples

>>> dataset = eaf.read_datasets("./doc/examples/input1.dat")
>>> eaf.subset(dataset, set = 1)
array([[8.07559653, 2.40702554, 1.        ],
       [8.66094446, 3.64050144, 1.        ],
       [0.20816431, 4.62275469, 1.        ],
       [4.8814328 , 9.09473137, 1.        ],
       [0.22997367, 1.11772205, 1.        ],
       [1.51643636, 3.07933731, 1.        ],
       [6.08152841, 4.58743853, 1.        ],
       [2.3530968 , 0.79055172, 1.        ],
       [8.7475454 , 1.71575862, 1.        ],
       [0.58799475, 0.73891181, 1.        ]])
>>> eaf.subset(dataset, range = [4, 6])
array([[9.9751443 , 3.41528862, 4.        ],
       [7.07633622, 4.44385483, 4.        ],
       [1.54507257, 2.71814725, 4.        ],
       [3.00766139, 4.63709876, 4.        ],
       [3.40976512, 2.1136231 , 4.        ],
       [4.08294878, 7.69585918, 4.        ],
       [0.2901393 , 8.32259412, 4.        ],
       [6.32324143, 1.28140989, 4.        ],
       [7.74140672, 5.00066389, 4.        ],
       [1.54506255, 0.38303122, 4.        ],
       [8.11318284, 6.45581597, 5.        ],
       [4.43498452, 4.13150648, 5.        ],
       [7.86851636, 3.17334347, 5.        ],
       [8.68699143, 5.3129827 , 5.        ],
       [8.75833731, 8.98886885, 5.        ],
       [9.78758589, 1.41238277, 5.        ],
       [7.85344142, 3.02219054, 5.        ],
       [0.9017068 , 7.49376946, 5.        ],
       [0.17470556, 8.89066343, 5.        ],
       [6.1631503 , 7.93840121, 5.        ],
       [4.10476852, 9.67891782, 6.        ],
       [8.57911868, 0.35169752, 6.        ],
       [4.96525837, 1.94353305, 6.        ],
       [8.17231096, 9.76977853, 6.        ],
       [6.78498493, 0.56380796, 6.        ],
       [2.71891214, 6.94327481, 6.        ],
       [3.4186965 , 9.38437467, 6.        ],
       [6.45431955, 4.06044388, 6.        ],
       [1.13096306, 9.72645436, 6.        ],
       [8.34008115, 5.70698919, 6.        ]])
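
The range selection shown above is an inclusive filter on the last column. A minimal numpy sketch of that filter, using a hypothetical helper `subset_range` and toy data rather than the library code:

```python
import numpy as np

def subset_range(dataset, first, last):
    """Rows whose set number lies in [first, last]; the set column is kept."""
    dataset = np.asarray(dataset, dtype=float)
    set_num = dataset[:, -1]
    return dataset[(set_num >= first) & (set_num <= last)]

toy = np.array([[1.0, 2.0, 1],
                [3.0, 0.5, 2],
                [2.0, 2.0, 3],
                [0.1, 9.0, 4]])
print(subset_range(toy, 2, 3))
```

Unlike data_subset(), the result keeps the set number column, so it can be passed on to functions that expect a multi-set dataset.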

See also

data_subset()