#######################################
#  SecSel, a software tool for conservation prioritization
#
#  2020-12-10
#######################################

Name of software: SecSel
Software access: http://www.nies.go.jp/biology/en/data/tool/secsel/index.html
Developer: TAKENAKA, Akio
Year first available: 2020
Source language: Python 3.X
Availability and cost: open source
    Released under the MIT license
    https://licenses.opensource.jp/

[About this document]

This document is an introduction to SecSel, a Python package that supports the design of protected areas for biodiversity conservation. It explains in detail what is not covered in readme_1st_en.txt, so it is best to read readme_1st_en.txt first.

This document explains the following:
- Setting conservation goals for each feature
- Considering costs when selecting sites
- Selecting sites with as little scattering as possible
- Resolving conflicts between features that are difficult to conserve and use in the same site
- Exporting detailed information to the output files
- Other settings
- How to work with SecSel in a Python program

test_dataset.zip contains an example parameter file (parameters_full.txt) covering all the configurable parameters, including the ones described in this document, as well as examples of the various data files specified in it.

The virtual data included in the test data zip file were generated with the included script gen_test_data_for_secsel.py. The evaluation data file foo_local_units.txt, described in readme_1st_en.txt, is also generated by this script. Because the script determines the distribution of each feature probabilistically, you will get data different from the attached files even if you generate them with the same settings. You can also generate data with different settings by editing the code accordingly.
In parameters_full.txt, some lines are commented out (a # at the beginning of a line disables that parameter setting). You can try them by deleting the # to enable them, or by changing the values of the parameters.

[Setting targets for individual features]

The conservation goal (how many high-value local units to include in a protected area) can be set with default_top_n as a default value common to all features. Alternatively, a value for each individual feature can be set in the file specified by the parameter top_n_file. Each line of that file contains a feature name and its conservation goal, separated by a tab. The first line is the header line. For features that have no target in this file, the target is set to the value of default_top_n.

Sample file: foo_target.txt

[Consideration of cost in site selection]

By considering the costs incurred when a site is added to a protected area, SecSel can select less costly sets of sites to protect. Various costs are possible, such as the cost of land acquisition and the cost of management after the protected area is established. Another cost is the reduction in the use of ecosystem services and other land uses (for example, reduced agricultural production and fisheries) that results from making the area a protected area.

Considering costs does not prevent the conservation targets from being achieved. When additional sites are selected to achieve the targets, those with relatively low costs are chosen.

The cost is given in a file with a site name and its cost on each line, separated by a tab. The first line is the header line. In the parameter file, the name of this file must be specified as site_cost_file.

Sample file: foo_cost.txt
In this example, the larger the y-coordinate of the site (the last three digits of the site name), the greater the cost. The cost is independent of the x-coordinate (the first three digits of the site name).
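
The cost file layout described above can be sketched in a few lines of Python. This is only an illustration, not the code that produced foo_cost.txt: the column titles "site" and "cost", the grid size, the zero-based coordinates, and the cost formula are all assumptions chosen for the example. The only properties taken from this document are the tab-delimited layout, the single header line, the six-digit site names, and a cost that grows with the y-coordinate only.

```python
import csv
import io


def site_name(x, y):
    """Build a six-digit site name: three digits for x, then three for y."""
    return f"{x:03d}{y:03d}"


def write_cost_table(out, nx, ny, cost_per_row=1.0):
    """Write a tab-delimited cost table with one header line.

    The header titles and the cost formula are hypothetical; adapt them
    to your own data.
    """
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    writer.writerow(["site", "cost"])  # header line (titles are assumed)
    for x in range(nx):
        for y in range(ny):
            # The cost depends on y only, not on x.
            writer.writerow([site_name(x, y), cost_per_row * (y + 1)])


# Build a small 2 x 3 table in memory and show it.
buf = io.StringIO()
write_cost_table(buf, nx=2, ny=3)
cost_text = buf.getvalue()
print(cost_text)
```

To create a real file, pass a file object opened with open("my_cost.txt", "w", newline="") instead of the in-memory buffer, and set site_cost_file to that (hypothetical) file name in the parameter file.
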
[Selecting sites that are not scattered]

In actual protected area design, management is often more efficient if the protected areas are as little scattered as possible. When designing a protected area, SecSel can consider a measure of scatteredness (a scatter penalty) and select sets of sites with a smaller scatter penalty.

As with costs, considering scatter penalties does not prevent the conservation targets from being achieved. When sites are added to achieve the targets, those with relatively small scatter penalties are selected.

There are many possible ways to calculate such a penalty. Currently, SecSel implements two methods.

One focuses on the total perimeter (BOUNDARY_LEN). The more neighboring sites share their borders, the smaller the total boundary length of the whole protected area (= penalty) becomes.

The other uses the distance between the site to be added and the nearest already selected site (DISTANCE_TO_NEAREST). The larger this value, the farther away the added site is, and the greater the penalty. The merit of this method is that it applies to discontinuous sites, such as ponds and fragmented vegetation.

To use these methods, specify BOUNDARY_LEN or DISTANCE_TO_NEAREST in the parameter file as scatter_penalty_algorithm. Also set scatter_penalty_file to the name of the data file needed to calculate the degree of scatteredness. The content of this file depends on the method of penalty calculation.

In the case of BOUNDARY_LEN (perimeter length)
Each line contains the names of two sites and the length of the boundary they share, separated by tabs. The first line is the header line.

Sample file: foo_border.txt
In this sample file, the first three digits of the site name are the x-coordinate and the next three digits are the y-coordinate. Each site is a rectangle.
The pairs of neighboring sites that share one side of the rectangle are listed, and the length of each side is assumed to be 1.

In the case of DISTANCE_TO_NEAREST
Each line contains the site name, x-coordinate, and y-coordinate, separated by tabs. The first line is the header line.

Sample file: foo_xy.txt
As with foo_border.txt, the first three digits of the site name are the x-coordinate and the next three digits are the y-coordinate.

[Resolution of conflicts among features]

If features cannot be protected and used at the same time within a single site, we say that they are in a conflict relationship. Possible examples are the conservation of woodland organisms versus use as farmland, or the conservation of plants that prefer oligotrophic (nutrient-poor) ponds versus plants that prefer eutrophic (nutrient-rich) ponds.

If local units of conflicting features are within the same site, SecSel decides which feature's unit takes priority over the others before starting site selection. This decision depends on which local unit has the higher potential contribution to achieving the conservation target of its feature. Features with higher target values require more units to achieve the goal. Therefore, even if two units have the same rank, the contribution of a unit is greater for the feature with the higher target value. If a feature has many units of the same value, there are more substitute units, and thus the potential contribution of each unit is smaller.

If you set the name of the file containing the conflict relationships between features as conflict_file in the parameter file, SecSel reads it and resolves the conflicts.

Each pair of consecutive lines defines one set of conflicting features. There may be any number of pairs in the conflict file. Each line is a tab-delimited list of feature names. The features in the first line of a pair conflict with those in the second line of that pair.
For example, the first line contains forest species and the second line contains grassland species. There can be any number of conflict pairs: forest versus grassland, conservation versus use, and so on. Blank lines are ignored, so a blank line between one two-line pair and the next makes the file easier for a human to read. There is no header line.

Sample file: foo_conflicts.txt

[Output detailed information]

The output of the results is controlled by two parameters specified in the parameter file.

If you make multiple runs (n_run is 2 or greater) and set save_each_run to 0, the result of each run is not written to a file, and only the summary result is output. If you give no value for save_each_run, SecSel treats it as 1, and the result of each run is written to a separate file.

The content of the output files is specified by result_file_format. This parameter takes one of three values: 'SITE', 'SITE_COST', or 'FULL'.

- A result file for each run
The name of the result file for each run consists of the string you set in output_file_base (XXX) and the serial number of the run (NNN), i.e., "XXX_NNN_PA.txt".

The file begins with a header line, followed by one line per selected site, in the order of selection.

Each line contains the name of the selected site first, then the order in which the site was selected. The selection order starts from 1, but any pre-reserved sites are labeled with order 0. These two items are output regardless of which of the three values ('SITE', 'SITE_COST', or 'FULL') you set for result_file_format.

If the format is SITE_COST or FULL, the cost accumulated up to that site follows the two items above. If there is no cost data, it is 0.

If the format is FULL, SecSel further outputs information on the features for which the site contributed to achieving the target.
This contribution information is output as 0 (not contributing) or 1 (contributing) for every feature. SecSel selects sites sequentially, evaluating which site to add to the protected area based on each site's contribution to achieving the targets of individual features. Thus, for each site, it is clear for which features the site was selected.

The FULL-formatted output file further lists the conservation/use value of each local unit of each feature in the site. The values are basically unchanged from the input file, but if there are conflicts between features, the values reflect the result of conflict resolution by SecSel. The value is written as -99 if SecSel decided, as a result of conflict resolution, not to protect the feature in the site.

- Summary file
The name of the summary file is the string set in output_file_base with "_sum.txt" appended to it. This file contains the number of times each site was selected over the multiple runs.

Regardless of the setting of result_file_format, the file has a header line followed by one line per site, in descending order of the number of times selected. Each line contains the site name followed by the number of times that site was selected.

If result_file_format is FULL, the number of times the site was selected to achieve the target of each feature is also output.

[Other settings in parameter file]

The following additional parameters can be set in the parameter file.

- pre_reserved_sites_file
The name of a text file that lists the sites to be pre-reserved in the protected area, with one site name on each line. The first line is the header line.
Sample file: foo_pre_reserved.txt

- max_upset
When selecting a site, you can give preference to sites with a low cost or scatter penalty rather than to the site with the maximum contribution to achieving the conservation targets.
In this case, set max_upset to determine how much lower in rank of conservation effect a site may be and still be included among the candidates for selection. max_upset is only valid if at least one of cost consideration and the scatter penalty is enabled. The larger the value, the more emphasis is placed on cost and scatter penalties in designing the protected area. If max_upset is 0, there are no upsets (default setting).

- consider_cost_first
If you set this parameter to 1, SecSel gives preference to a lower cost over a smaller scatter penalty when selecting sites. The default (0) gives preference to scatter penalties. This parameter is only valid if you set both the cost data and the method of calculating the scatter penalty.

[How to use SecSel in a script]

- Import and execution of SecSel in a Python script

After installing secsel, you can import it in a Python script. secsel is the name of the module, and SecSel is the name of the class defined in that module. Writing SecSel() in a Python script creates an object of the SecSel class.

To import secsel, write the following at the beginning of the Python script (program).

    from secsel import SecSel

In the subsequent script, write, for example,

    s = SecSel()

This creates an object of the SecSel class that can be accessed through the variable named s.

See run_secsel_example.py for an example script used in the test run.

The methods (functions) of a SecSel object available to users are as follows.

# Methods to use before executing run()

load_parameter_file(self, file_name)
    Reads the parameter file with the specified name.

set_parameter(self, parameter_name, parameter_value)
    Sets the value of the parameter with the specified name. If the parameter already has a value, this method overrides it.

# Methods to perform site selection

run(self)
    Performs site selection according to the parameter settings and the data loaded from the input files.
    The selection results are written to files in the format specified by the configuration. You can call run() only once per SecSel object. Calling run() a second time on the same object results in an error.

# Methods to use after run()

get_list_of_lists_of_selected_sites(self)
    Returns a nested list of selected site names. Each inner list holds the sites selected in a single run, and the length of the outer list is equal to the number of runs.

get_list_of_costs(self)
    Returns a list of the total costs of the selected sites in each run. Its length is the number of runs.

# Other methods

get_parameter(self, parameter_name)
    Returns the value of the parameter at the time the method is called. Use it to get, in a script, the value of a parameter loaded from a parameter file.

get_file_name_for_a_run(self, i_run)
    Returns the name of the file that stores the results of the specified run.

get_summary_file_name(self)
    Returns the name of the file that stores the summary of the results of multiple runs.

- Dynamically setting parameters in a script

Any parameter value that you can set in the parameter file can also be set or changed using set_parameter(). For example, you can automatically change the target (default_top_n) to several different values while keeping the input data the same. This is convenient because there is no need to prepare a parameter file for each setting. For example code, see run_secsel_multi_settings.py.

- Analyzing selection results in a script

You can extract the selection results from the SecSel object with get_list_of_lists_of_selected_sites() and get_list_of_costs(). However, these methods only retrieve the names of the selected sites and the total costs. To obtain detailed information about the selected sites, such as the features for which each site was selected, you need to read the output file created by setting result_file_format to FULL.
The names of the output files are obtained using get_file_name_for_a_run() and get_summary_file_name().
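
The methods described above can be combined as in the following sketch, which repeats the selection for several values of default_top_n and collects the results. It is not taken from run_secsel_multi_settings.py. The function name sweep_default_top_n and the layout of the returned dictionary are hypothetical, and passing an integer to set_parameter() is an assumption. The SecSel class is passed in as an argument so that a fresh object can be created for every setting, because run() can be called only once per object.

```python
def sweep_default_top_n(selector_class, parameter_file, targets):
    """Run one site selection per target value and collect the results.

    selector_class is expected to be the SecSel class. Only methods
    listed in this document are called on its objects.
    """
    results = {}
    for top_n in targets:
        # A new object per setting: run() may be called only once per object.
        s = selector_class()
        s.load_parameter_file(parameter_file)
        # Override the value read from the parameter file.
        # Assumption: an integer is accepted as the parameter value.
        s.set_parameter("default_top_n", top_n)
        s.run()
        selected = s.get_list_of_lists_of_selected_sites()
        results[top_n] = {
            "costs": s.get_list_of_costs(),  # one total cost per run
            "n_sites": [len(sites) for sites in selected],  # sites per run
            "summary_file": s.get_summary_file_name(),
        }
    return results


# Possible usage once secsel is installed (not executed here):
#   from secsel import SecSel
#   results = sweep_default_top_n(SecSel, "parameters_full.txt", [1, 2, 3])
#   for top_n, r in results.items():
#       print(top_n, r["costs"], r["n_sites"])
```

If every setting uses the same output_file_base, the output files of the different settings will presumably have the same names. In that case, consider also changing output_file_base with set_parameter() for each setting.
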