eafpy.eaf module
- exception eafpy.eaf.ReadDatasetsError(error_code)
Bases: Exception
Custom exception class for an error returned by the read_datasets function.
- Parameters:
error_code (int) – Error code returned by the read_datasets C function, which maps to a string from
- eafpy.eaf.avg_hausdorff_dist(data, ref, maximise=False, p=1)
Calculate the averaged Hausdorff distance.
See igd().
- eafpy.eaf.data_subset(dataset, set)
Select the data points of a specific set from a dataset. Returns a single set, without the set number column.
This can be used to prepare data for input to functions such as igd() and hypervolume().
Similar to the subset() function, but it can only return one set and it removes the last column (set number).
- Parameters:
dataset (numpy array) – Numpy array of numerical values and set numbers, containing multiple sets. For example, the output of the read_datasets() function.
set (integer) – Number of the set to select. Only rows whose set number equals this argument are returned.
- Returns:
A single set with only the objective data (set numbers are excluded).
- Return type:
numpy array
Examples
>>> dataset = eaf.read_datasets("./doc/examples/input1.dat")
>>> data1 = eaf.data_subset(dataset, set = 1)
The above selects set 1 and removes the set number column, so the result can be used as an input to functions such as hypervolume().
>>> eaf.hypervolume(data1, [10, 10])
90.46272764755885
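The selection described for data_subset() can be pictured with plain NumPy indexing. The following is only a sketch of the idea, not the library's implementation; the helper name select_set and the toy dataset are hypothetical.

```python
import numpy as np

# Hypothetical toy dataset: two objective columns plus a set-number column.
dataset = np.array([
    [8.0, 2.5, 1.0],
    [0.2, 4.6, 1.0],
    [9.9, 3.4, 2.0],
    [1.5, 0.4, 2.0],
])

def select_set(dataset, set_number):
    """Return the rows of one set, dropping the last (set number) column."""
    mask = dataset[:, -1] == set_number
    return dataset[mask, :-1]

set2 = select_set(dataset, 2)
print(set2)
```

The result has one column fewer than the input, which is the shape that functions taking only objective values expect.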
- eafpy.eaf.epsilon_additive(data, ref, maximise=False)
Computes the epsilon metric, either additive or multiplicative.
All values in data and ref must be larger than 0 for epsilon_mult.
- Parameters:
data (numpy.ndarray) – Numpy array of numerical values, where each row gives the coordinates of a point in objective space. If the array is created from the read_datasets() function, remove the last (set) column.
ref (numpy.ndarray or list) – Reference point set as a numpy array or list. Must have the same number of columns as a single point in the dataset.
maximise (bool or list of bool) – Whether the objectives must be maximised instead of minimised. Either a single boolean value that applies to all objectives or a list of booleans, with one value per objective. Also accepts a 1D numpy array with value 0/1 for each objective.
- Returns:
A single numerical value
- Return type:
float
Examples
>>> dat = np.array([[3.5,5.5], [3.6,4.1], [4.1,3.2], [5.5,1.5]])
>>> ref = np.array([[1, 6], [2,5], [3,4], [4,3], [5,2], [6,1]])
>>> eaf.epsilon_additive(dat, ref = ref)
2.5
>>> eaf.epsilon_mult(dat, ref = ref)
3.5
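To make the two indicators concrete, here is a NumPy sketch of the usual textbook definitions for minimisation: for every reference point, take the data point that needs the smallest shift (additive) or scaling (multiplicative) to weakly dominate it, then report the worst case over the reference set. It is assumed, not confirmed, that the C implementation follows the same definition; the function names ending in _sketch are hypothetical. On the example data above it gives the same 2.5 and 3.5.

```python
import numpy as np

def epsilon_additive_sketch(data, ref):
    """Additive epsilon for minimisation (textbook definition, assumed)."""
    data = np.asarray(data, dtype=float)
    ref = np.asarray(ref, dtype=float)
    # diff[i, j, k] = data[j, k] - ref[i, k]
    diff = data[np.newaxis, :, :] - ref[:, np.newaxis, :]
    # Worst objective per (ref, data) pair, best data point per ref point,
    # then the worst reference point overall.
    return diff.max(axis=2).min(axis=1).max()

def epsilon_mult_sketch(data, ref):
    """Multiplicative epsilon; all values must be larger than 0."""
    data = np.asarray(data, dtype=float)
    ref = np.asarray(ref, dtype=float)
    ratio = data[np.newaxis, :, :] / ref[:, np.newaxis, :]
    return ratio.max(axis=2).min(axis=1).max()

dat = np.array([[3.5, 5.5], [3.6, 4.1], [4.1, 3.2], [5.5, 1.5]])
ref = np.array([[1, 6], [2, 5], [3, 4], [4, 3], [5, 2], [6, 1]])
print(epsilon_additive_sketch(dat, ref))
print(epsilon_mult_sketch(dat, ref))
```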
- eafpy.eaf.filter_dominated(data, maximise=False, keep_weakly=False)
Remove dominated points according to Pareto optimality. See is_nondominated() for details.
- eafpy.eaf.filter_dominated_sets(dataset, maximise=False, keep_weakly=False)
Filter dominated points for multiple sets.
Executes the filter_dominated() function for every set in a dataset and returns a dataset, preserving the set numbers.
Examples
>>> dataset = eaf.read_datasets("./doc/examples/input1.dat")
>>> subset = eaf.subset(dataset, range = [3,5])
>>> eaf.filter_dominated_sets(subset)
array([[2.60764118, 6.31309852, 3. ],
       [3.22509709, 6.1522834 , 3. ],
       [0.37731545, 9.02211752, 3. ],
       [4.61023932, 2.29231998, 3. ],
       [0.2901393 , 8.32259412, 4. ],
       [1.54506255, 0.38303122, 4. ],
       [4.43498452, 4.13150648, 5. ],
       [9.78758589, 1.41238277, 5. ],
       [7.85344142, 3.02219054, 5. ],
       [0.9017068 , 7.49376946, 5. ],
       [0.17470556, 8.89066343, 5. ]])
The above returns sets 3, 4 and 5 with the dominated points within each set removed.
- eafpy.eaf.get_eaf(data, percentiles=[], debug=False)
Empirical attainment function (EAF) calculation.
Calculate the EAF in 2D or 3D from the input dataset.
- Parameters:
data (numpy array) – Numpy array of numerical values and set numbers, containing multiple sets. For example, the output of the read_datasets() function.
percentiles (list) – A list of percentiles to calculate. If empty, all possible percentiles are calculated. Note the maximum
debug (bool) – (For developers) print out debugging information in the C code.
- Returns:
A numpy array containing the EAF data points, with the same number of columns as the input argument, but a different number of rows. The last column represents the EAF percentile for that data point.
- Return type:
numpy array
Examples
>>> dataset = eaf.read_datasets("./doc/examples/input1.dat")
>>> subset = eaf.subset(dataset, range = [7,10])
>>> eaf.get_eaf(subset)
array([[ 0.62230271, 3.56945324, 25. ],
       [ 0.86723965, 1.58599089, 25. ],
       [ 6.43135537, 1.00153569, 25. ],
       [ 9.7398055 , 0.36688707, 25. ],
       [ 0.6510164 , 9.42381213, 50. ],
       [ 0.79293574, 6.46605414, 50. ],
       [ 1.30291449, 4.50417698, 50. ],
       [ 1.58498886, 2.87955367, 50. ],
       [ 7.04694467, 1.83484358, 50. ],
       [ 9.7398055 , 1.00153569, 50. ],
       [ 0.99008784, 8.84691923, 75. ],
       [ 1.06855707, 6.7102429 , 75. ],
       [ 3.34035397, 2.89377444, 75. ],
       [ 9.30137043, 2.14328532, 75. ],
       [ 9.7398055 , 1.83484358, 75. ],
       [ 9.94332713, 1.50186503, 75. ],
       [ 1.06855707, 8.84691923, 100. ],
       [ 3.34035397, 6.7102429 , 100. ],
       [ 4.93663823, 6.20957074, 100. ],
       [ 7.92511295, 3.92669598, 100. ]])
- eafpy.eaf.hypervolume(data, ref)
Hypervolume indicator
Computes the hypervolume metric with respect to a given reference point, assuming minimisation of all objectives.
- Parameters:
data (numpy.ndarray) – Numpy array of numerical values, where each row gives the coordinates of a point in objective space. If the array is created from the read_datasets() function, remove the last column.
ref (numpy array or list) – Reference point set as a numpy array or list. Must be the same length as a single point in the dataset.
- Returns:
A single numerical value, the hypervolume indicator
- Return type:
float
Examples
>>> dat = np.array([[5,5],[4,6],[2,7], [7,4]])
>>> eaf.hypervolume(dat, ref = [10, 10])
38.0
Select set 1 of the dataset and remove the set number column:
>>> dat = eaf.read_datasets("./doc/examples/input1.dat")
>>> set1 = dat[dat[:,2]==1, :2]
This set contains dominated points, so remove them:
>>> set1 = eaf.filter_dominated(set1)
>>> eaf.hypervolume(set1, ref= [10, 10])
90.46272764755885
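For two objectives the hypervolume is an area, which a left-to-right sweep can accumulate. The sketch below is a pure NumPy illustration for the 2D minimisation case only; it is not the algorithm used by the C code, and the name hypervolume_2d is hypothetical. On the first example above it gives 38.0.

```python
import numpy as np

def hypervolume_2d(data, ref):
    """Area dominated by `data` and bounded by `ref` (2 objectives, minimisation)."""
    data = np.asarray(data, dtype=float)
    ref = np.asarray(ref, dtype=float)
    # Points that are not strictly better than the reference contribute nothing.
    data = data[(data < ref).all(axis=1)]
    # Sort by the first objective, breaking ties by the second.
    order = np.lexsort((data[:, 1], data[:, 0]))
    area = 0.0
    best_y = ref[1]
    for x, y in data[order]:
        if y < best_y:  # otherwise the point is dominated by one already swept
            area += (ref[0] - x) * (best_y - y)
            best_y = y
    return area

dat = np.array([[5, 5], [4, 6], [2, 7], [7, 4]])
print(hypervolume_2d(dat, [10, 10]))
```

Because dominated points are skipped during the sweep, filtering them beforehand does not change the value.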
- eafpy.eaf.igd(data, ref, maximise=False)
Inverted Generational Distance (IGD and IGD+) and Averaged Hausdorff Distance.
Functions to compute the inverted generational distance (IGD and IGD+) and the averaged Hausdorff distance between nondominated sets of points.
See the full documentation here: https://mlopez-ibanez.github.io/eaf/reference/igd.html
- Parameters:
data (numpy.ndarray) – Numpy array of numerical values, where each row gives the coordinates of a point in objective space. If the array is created from the read_datasets() function, remove the last (set) column.
ref (numpy.ndarray or list) – Reference point set as a numpy array or list. Must have the same number of columns as the dataset.
maximise (bool or list of bool) – Whether the objectives must be maximised instead of minimised. Either a single boolean value that applies to all objectives or a list of booleans, with one value per objective. Also accepts a 1D numpy array with value 0/1 for each objective.
p (float, default 1) – Hausdorff distance parameter. Must be larger than 0. Only accepted by avg_hausdorff_dist().
- Returns:
A single numerical value
- Return type:
float
Examples
>>> dat = np.array([[3.5,5.5], [3.6,4.1], [4.1,3.2], [5.5,1.5]])
>>> ref = np.array([[1, 6], [2,5], [3,4], [4,3], [5,2], [6,1]])
>>> eaf.igd(dat, ref = ref)
1.0627908666722465
>>> eaf.igd_plus(dat, ref = ref)
0.9855036468106652
>>> eaf.avg_hausdorff_dist(dat, ref)
1.0627908666722465
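The three distances can be written in a few lines of NumPy. This sketch uses the common definitions for minimisation: IGD averages, over the reference points, the Euclidean distance to the nearest data point; IGD+ only counts the components in which the data point is worse than the reference point; and the averaged Hausdorff distance is the larger of the p-averaged distances in both directions. Whether the library computes them in exactly this way is an assumption, and the _sketch names are hypothetical. On the example data the results agree with the values shown above.

```python
import numpy as np

def _diff(data, ref):
    # diff[i, j, k] = data[j, k] - ref[i, k]
    data = np.asarray(data, dtype=float)
    ref = np.asarray(ref, dtype=float)
    return data[np.newaxis, :, :] - ref[:, np.newaxis, :]

def igd_sketch(data, ref):
    dist = np.sqrt((_diff(data, ref) ** 2).sum(axis=2))
    return dist.min(axis=1).mean()

def igd_plus_sketch(data, ref):
    # Only components where the data point is worse (larger) count.
    worse = np.maximum(_diff(data, ref), 0.0)
    dist = np.sqrt((worse ** 2).sum(axis=2))
    return dist.min(axis=1).mean()

def avg_hausdorff_sketch(data, ref, p=1):
    dist = np.sqrt((_diff(data, ref) ** 2).sum(axis=2))
    igd_p = np.mean(dist.min(axis=1) ** p) ** (1.0 / p)  # ref -> data
    gd_p = np.mean(dist.min(axis=0) ** p) ** (1.0 / p)   # data -> ref
    return max(gd_p, igd_p)

dat = np.array([[3.5, 5.5], [3.6, 4.1], [4.1, 3.2], [5.5, 1.5]])
ref = np.array([[1, 6], [2, 5], [3, 4], [4, 3], [5, 2], [6, 1]])
print(igd_sketch(dat, ref), igd_plus_sketch(dat, ref), avg_hausdorff_sketch(dat, ref))
```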
- eafpy.eaf.is_nondominated(data, maximise=False, keep_weakly=False)
Identify and remove dominated points according to Pareto optimality.
- Parameters:
data (numpy array) – Numpy array of numerical values, where each row gives the coordinates of a point in objective space. If the array is created from the read_datasets() function, remove the last column.
maximise (single bool, or list of booleans) – Whether the objectives must be maximised instead of minimised. Either a single boolean value that applies to all objectives or a list of boolean values, with one value per objective. Also accepts a 1D numpy array with value 0/1 for each objective.
keep_weakly (bool) – If False, return False for any duplicates of nondominated points.
- Returns:
is_nondominated returns a boolean array with one entry per row of data, where True means that the point is not dominated by any other point.
filter_dominated returns a numpy array with only mutually nondominated points.
- Return type:
bool array
Examples
>>> S = np.array([[1,1], [0,1], [1,0], [1,0]])
>>> eaf.is_nondominated(S)
array([False, True, False, True])
>>> eaf.is_nondominated(S, maximise = True)
array([ True, False, False, False])
>>> eaf.filter_dominated(S)
array([[0, 1], [1, 0]])
>>> eaf.filter_dominated(S, keep_weakly = True)
array([[0, 1], [1, 0], [1, 0]])
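The dominance test itself can be sketched with NumPy broadcasting. The version below checks strict Pareto dominance only, so duplicated nondominated points are all kept; this corresponds to the keep_weakly = True behaviour in the examples above, and it makes no attempt to reproduce how the library chooses among duplicates when keep_weakly is False. The name is_nondominated_sketch is hypothetical.

```python
import numpy as np

def is_nondominated_sketch(data, maximise=False):
    """True for each row that no other row strictly Pareto-dominates."""
    data = np.asarray(data, dtype=float)
    # Accept one bool for all objectives or one bool per objective.
    maximise = np.broadcast_to(np.asarray(maximise, dtype=bool), (data.shape[1],))
    # Negate maximised objectives so that every column is minimised.
    signed = np.where(maximise, -data, data)
    # no_worse[i, j]: row j is <= row i in every objective.
    no_worse = (signed[np.newaxis, :, :] <= signed[:, np.newaxis, :]).all(axis=2)
    # better[i, j]: row j is < row i in at least one objective.
    better = (signed[np.newaxis, :, :] < signed[:, np.newaxis, :]).any(axis=2)
    return ~(no_worse & better).any(axis=1)

S = np.array([[1, 1], [0, 1], [1, 0], [1, 0]])
print(is_nondominated_sketch(S))                 # both copies of (1, 0) are kept
print(is_nondominated_sketch(S, maximise=True))
print(S[is_nondominated_sketch(S)])              # filtering is a boolean mask
```

The pairwise comparison needs memory quadratic in the number of points, so it is only suitable for small sets.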
- eafpy.eaf.normalise(data, to_range=[0.0, 1.0], lower=nan, upper=nan, maximise=False)
Normalise points per coordinate to a range, e.g., to_range = [1,2], where the minimum value will correspond to 1 and the maximum to 2.
- Parameters:
data (numpy.ndarray) – Numpy array of numerical values, where each row gives the coordinates of a point in objective space. See normalise_sets() to normalise data that includes set numbers (multiple sets).
to_range (numpy array or list of 2 values) – Normalise values to this range. If the objective is maximised, it is normalised to (to_range[1], to_range[0]) instead.
lower (list or np array) – Lower bounds on the values. If np.nan, the minimum value of each coordinate is used.
upper (list or np array) – Upper bounds on the values. If np.nan, the maximum value of each coordinate is used.
maximise (single bool, or list of booleans) – Whether the objectives must be maximised instead of minimised. Either a single boolean value that applies to all objectives or a list of booleans, with one value per objective. Also accepts a 1D numpy array with values 0 or 1 for each objective.
- Returns:
The data normalised as requested.
- Return type:
numpy array
Examples
>>> dat = np.array([[3.5,5.5], [3.6,4.1], [4.1,3.2], [5.5,1.5]])
>>> eaf.normalise(dat)
array([[0. , 1. ],
       [0.05 , 0.65 ],
       [0.3 , 0.425],
       [1. , 0. ]])
>>> eaf.normalise(dat, to_range = [1,2], lower = [3.5, 3.5], upper = 5.5)
array([[1. , 2. ],
       [1.05, 1.3 ],
       [1.3 , 0.85],
       [2. , 0. ]])
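The arithmetic behind both examples is a per-column linear rescaling. The sketch below reproduces it for the minimisation case; it uses None rather than np.nan for missing bounds and ignores maximise, both of which are simplifications made here, and the name normalise_sketch is hypothetical.

```python
import numpy as np

def normalise_sketch(data, to_range=(0.0, 1.0), lower=None, upper=None):
    """Rescale each column linearly so that lower -> to_range[0], upper -> to_range[1]."""
    data = np.asarray(data, dtype=float)
    # Fall back to the per-column minimum and maximum when no bounds are given.
    lower = data.min(axis=0) if lower is None else np.asarray(lower, dtype=float)
    upper = data.max(axis=0) if upper is None else np.asarray(upper, dtype=float)
    unit = (data - lower) / (upper - lower)  # assumes upper != lower in every column
    return to_range[0] + unit * (to_range[1] - to_range[0])

dat = np.array([[3.5, 5.5], [3.6, 4.1], [4.1, 3.2], [5.5, 1.5]])
print(normalise_sketch(dat))
print(normalise_sketch(dat, to_range=(1, 2), lower=[3.5, 3.5], upper=5.5))
```

In the second call the value 1.5 lies below the given lower bound of 3.5, which is why the result contains 0. even though the target range is [1, 2].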
- eafpy.eaf.normalise_sets(dataset, range=[0, 1], lower='na', upper='na', maximise=False)
Normalise a dataset with multiple sets.
Executes the normalise() function for every set in a dataset (each set is normalised separately).
Examples
>>> dataset = eaf.read_datasets("./doc/examples/input1.dat")
>>> subset = eaf.subset(dataset, range = [4,5])
>>> eaf.normalise_sets(subset)
array([[1. , 0.38191742, 4. ],
       [0.70069111, 0.5114669 , 4. ],
       [0.12957487, 0.29411141, 4. ],
       [0.28059067, 0.53580626, 4. ],
       [0.32210885, 0.21797067, 4. ],
       [0.39161668, 0.92106178, 4. ],
       [0. , 1. , 4. ],
       [0.62293227, 0.11315216, 4. ],
       [0.76936124, 0.58159784, 4. ],
       [0.12957384, 0. , 4. ],
       [0.82581672, 0.66566917, 5. ],
       [0.44318444, 0.35888982, 5. ],
       [0.80036477, 0.23242446, 5. ],
       [0.88550836, 0.51482968, 5. ],
       [0.89293026, 1. , 5. ],
       [1. , 0. , 5. ],
       [0.79879657, 0.21247419, 5. ],
       [0.07562783, 0.80266586, 5. ],
       [0. , 0.98703813, 5. ],
       [0.6229605 , 0.8613516 , 5. ]])
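Normalising each set separately means that every set gets its own minimum and maximum, so each set spans the full target range on its own. A minimal sketch of that per-set loop, assuming the default [0, 1] range and minimisation (the helper name and toy data are hypothetical):

```python
import numpy as np

def normalise_per_set(dataset):
    """Rescale the objective columns of every set to [0, 1] independently."""
    out = np.array(dataset, dtype=float)  # work on a copy
    set_numbers = out[:, -1]
    for s in np.unique(set_numbers):
        rows = set_numbers == s
        objs = out[rows, :-1]
        lo, hi = objs.min(axis=0), objs.max(axis=0)
        out[rows, :-1] = (objs - lo) / (hi - lo)  # assumes hi != lo
    return out

# Hypothetical toy dataset: two sets on very different scales.
toy = np.array([
    [0.0, 10.0, 1.0],
    [4.0, 30.0, 1.0],
    [2.0, 20.0, 1.0],
    [100.0, 1.0, 2.0],
    [200.0, 3.0, 2.0],
])
print(normalise_per_set(toy))
```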
- eafpy.eaf.rand_non_dominated_sets(num_points, num_sets=10, shape=3, scale=1)
Create randomised non-dominated sets
Create a dataset of random non-dominated sets following a gamma distribution. This is slow for a large number of points (> 100).
- Parameters:
num_points (integer) – Number of points in the resulting dataset.
num_sets (integer) – Number of sets. Every set must contain the same number of points, so num_points % num_sets must equal 0.
shape (float) – Shape parameter of the gamma distribution.
scale (float) – Scale parameter of the gamma distribution.
- Returns:
An (n, 3) numpy array containing non-dominated points and set numbers. The last column represents the set numbers.
- Return type:
np.ndarray (n, 3)
- eafpy.eaf.read_datasets(filename)
Reads an input dataset file, parsing the file and returning a numpy array.
- Parameters:
filename (str) – Filename of the dataset file. Each row of the table appears as one line of the file. Datasets are separated by an empty line. If it does not contain an absolute path, the file name is relative to the current working directory. If the filename has extension '.xz', it is decompressed to a temporary file before reading it.
- Returns:
An array containing a representation of the data in the file. The first n-1 columns contain the numerical data for each of the objectives. The last column contains the number of the set that each row belongs to.
- Return type:
numpy.ndarray
Examples
>>> eaf.read_datasets("./doc/examples/input1.dat")
array([[ 8.07559653, 2.40702554, 1. ],
       [ 8.66094446, 3.64050144, 1. ],
       [ 0.20816431, 4.62275469, 1. ],
       ...
       [ 4.92599726, 2.70492519, 10. ],
       [ 1.22234394, 5.68950311, 10. ],
       [ 7.99466959, 2.81122537, 10. ],
       [ 2.12700289, 2.43114174, 10. ]])
The numpy array represents this data:
Objective 1    Objective 2    Set Number
8.07559653     2.40702554     1.0
8.66094446     3.64050144     1.0
etc.           etc.           etc.
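The file layout described above (one point per line, sets separated by an empty line) maps to the returned array as follows. This pure Python sketch only illustrates that mapping; it assumes whitespace-separated values, does not handle '.xz' files, and is not the C parser used by read_datasets(). The name parse_datasets and the sample text are hypothetical.

```python
import io
import numpy as np

def parse_datasets(text_stream):
    """Parse lines of numbers into an array with a trailing set-number column."""
    rows = []
    set_number = 1
    set_has_rows = False
    for line in text_stream:
        line = line.strip()
        if not line:
            # An empty line closes the current set; repeated blanks count once.
            if set_has_rows:
                set_number += 1
                set_has_rows = False
            continue
        rows.append([float(v) for v in line.split()] + [float(set_number)])
        set_has_rows = True
    return np.array(rows)

# Hypothetical file contents: two points in set 1, one point in set 2.
sample = io.StringIO("1.0 2.0\n3.0 4.0\n\n5.0 6.0\n")
parsed = parse_datasets(sample)
print(parsed)
```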
- eafpy.eaf.subset(dataset, set=-2, range=[])
Subset is a convenience function for extracting a set or a range of sets from a larger dataset. It takes a dataset with multiple set numbers and returns one or more sets (with their set numbers).
Use data_subset() to choose a single set without the set number column.
- Parameters:
dataset (numpy array) – Numpy array of numerical values and set numbers, containing multiple sets. For example, the output of the read_datasets() function.
set (integer) – Number of the set to select. Only rows whose set number equals this argument are returned.
range (list (length 2)) – Select the sets whose set number satisfies range[0] <= set number <= range[1].
- Returns:
A numpy array with the same columns as the input, containing only the selected sets.
- Return type:
numpy array
Examples
>>> dataset = eaf.read_datasets("./doc/examples/input1.dat")
>>> eaf.subset(dataset, set = 1)
array([[8.07559653, 2.40702554, 1. ],
       [8.66094446, 3.64050144, 1. ],
       [0.20816431, 4.62275469, 1. ],
       [4.8814328 , 9.09473137, 1. ],
       [0.22997367, 1.11772205, 1. ],
       [1.51643636, 3.07933731, 1. ],
       [6.08152841, 4.58743853, 1. ],
       [2.3530968 , 0.79055172, 1. ],
       [8.7475454 , 1.71575862, 1. ],
       [0.58799475, 0.73891181, 1. ]])
>>> eaf.subset(dataset, range =[4, 6])
array([[9.9751443 , 3.41528862, 4. ],
       [7.07633622, 4.44385483, 4. ],
       [1.54507257, 2.71814725, 4. ],
       [3.00766139, 4.63709876, 4. ],
       [3.40976512, 2.1136231 , 4. ],
       [4.08294878, 7.69585918, 4. ],
       [0.2901393 , 8.32259412, 4. ],
       [6.32324143, 1.28140989, 4. ],
       [7.74140672, 5.00066389, 4. ],
       [1.54506255, 0.38303122, 4. ],
       [8.11318284, 6.45581597, 5. ],
       [4.43498452, 4.13150648, 5. ],
       [7.86851636, 3.17334347, 5. ],
       [8.68699143, 5.3129827 , 5. ],
       [8.75833731, 8.98886885, 5. ],
       [9.78758589, 1.41238277, 5. ],
       [7.85344142, 3.02219054, 5. ],
       [0.9017068 , 7.49376946, 5. ],
       [0.17470556, 8.89066343, 5. ],
       [6.1631503 , 7.93840121, 5. ],
       [4.10476852, 9.67891782, 6. ],
       [8.57911868, 0.35169752, 6. ],
       [4.96525837, 1.94353305, 6. ],
       [8.17231096, 9.76977853, 6. ],
       [6.78498493, 0.56380796, 6. ],
       [2.71891214, 6.94327481, 6. ],
       [3.4186965 , 9.38437467, 6. ],
       [6.45431955, 4.06044388, 6. ],
       [1.13096306, 9.72645436, 6. ],
       [8.34008115, 5.70698919, 6. ]])
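The range selection is an inclusive inequality on the last column. As a rough NumPy equivalent (a sketch with a hypothetical helper name and toy data, not the library code):

```python
import numpy as np

def select_set_range(dataset, first, last):
    """Keep rows with first <= set number <= last, including the set column."""
    set_numbers = dataset[:, -1]
    mask = (set_numbers >= first) & (set_numbers <= last)
    return dataset[mask]

# Hypothetical toy dataset with four sets of one point each.
toy = np.array([
    [0.5, 0.9, 1.0],
    [0.4, 0.8, 2.0],
    [0.3, 0.7, 3.0],
    [0.2, 0.6, 4.0],
])
picked = select_set_range(toy, 2, 3)
print(picked)
```

Unlike the single-set selection done by data_subset(), the set number column is kept, so the result can be passed on to functions that operate on multiple sets.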