Global Core Biodata Resource Selection 2022

Frequently Asked Questions

This web page is the FAQ for the process to select a set of Global Core Biodata Resources run by the Global Biodata Coalition in 2022. It is preserved here for the historical record.

 

The updated FAQ for the 2023 GCBR round of selection is here.

This Frequently Asked Questions page is designed to assist applicants to the Global Core Biodata Resource (GCBR) selection process by answering questions that frequently arise. It is a companion to the primary reference: Global Core Biodata Resources: Concept and Selection Process. The FAQ will be updated from time to time in light of input from the biodata resource community.

General Questions

What is the Global Biodata Coalition?

The Global Biodata Coalition (GBC) is supported by research funders and aims:

  • To be a forum in which funders of biodata resources can better coordinate and share approaches and strategies for the efficient management and growth of this infrastructure.
  • To stabilize and ensure sustainable financial support for the global biodata infrastructure, with a focus on an identified and prioritized set of Global Core Biodata Resources that are crucial for sustaining the broader infrastructure.

What is the GBC’s definition of a biodata resource?

In this context a biodata resource is any biological, life science, or biomedical database that archives research data generated by scientists, or functions as a knowledgebase by adding value to scientific data by aggregation, processing, and expert curation. Although such biodata resources may include within them tools for searching or assembling reports from their content, their primary function is as a database, as opposed to an analysis tool. This distinguishes them from software tools or analysis platforms that serve to analyse or process user-supplied data.

What are the key criteria being used to select Global Core Biodata Resources?

The selection of GCBRs is based on multiple quantitative and qualitative indicators that fall into five categories:

  • Scientific focus and quality of science
  • Community served by the resource
  • Quality of service
  • Funding, governance and legal infrastructure
  • Impact stories

Where can I find the complete list of indicators that will be used for GCBR selection?

The full list of indicators is shown in a table in the Appendix of the primary publication “Global Core Biodata Resources: Concept and Selection Process”, available at https://doi.org/10.5281/zenodo.5845116.

Why does the application process have an Expression of Interest step?

The Expression of Interest questions, in particular the eligibility criteria, have been designed to help applicants gauge for themselves whether it is worth committing the significant staff effort required for a full application. Assembling the data required to give an account of a biodata resource across the complete set of indicators used to select GCBRs will be time-consuming for the biodata resource personnel lodging the full application. Correspondingly, reviewing the full applications across all the indicators will be time-consuming for the Review Committee. The GBC does not wish to waste the time of either biodata resource personnel or the Review Committee, so it will use the Expression of Interest step to screen applications for the likelihood that they could fulfil the criteria for a core biodata resource. Only those that show characteristics in line with GCBR expectations will be invited to submit a full application.

Who will have access to the data I supply in my Expression of Interest and Full Application?

The GBC Secretariat and the Review Committee will have access to the Expressions of Interest and Full Applications received from applicants. All reviewers will be asked to sign a confidentiality declaration as they join the Review Committee, in recognition of the sensitivity of some of the data requested as part of the application process.

Who will review the applications?

Both the Expressions of Interest and the Full Applications will be reviewed by members of the GBC Scientific Advisory Committee, with additional experts in biodata infrastructure provision brought in where needed to provide the necessary standard of critical review across all subject domains represented in the applications. The names and affiliations of members of the Review Committee will be published after the review process is concluded.

Will only one biodata resource be selected for each scientific domain/subject area?

No. Each application will be evaluated on the basis of potential for attaining GCBR status (Expression of Interest stage) and excellence in terms of the indicators (Full Application stage), without reference to other applications. Thus it is possible that more than one biodata resource for a given scientific domain or subject area could be represented on the initial list of GCBRs and, conversely, that other scientific domains will be unrepresented.

What if I would like to submit an application for my biodata resource but will be unable to meet the deadline?

Late applications will not be accepted. However, this is the first round of Global Core Biodata Resource selection and there will be other opportunities to apply in the future. It is anticipated that future selection rounds will be run every two or three years.

Will the full list of applicant biodata resources be made public?

No, only the list of those biodata resources selected as GCBRs will be published.

What if the primary language of my biodata resource is not English?

The application form allows biodata resources presented in any language to apply, though for the initial round of selection it is necessary that an English language version is freely available. Please see “Why is the call for applications limited to biodata resources made available to users in the English language?” in the next section.

Will successful applicants be eligible for new funding?

The GBC is not itself a funding agency, and thus is not in a position to provide direct support to the selected Global Core Biodata Resources. It is, however, a stated aim of the GBC to stabilize and ensure sustainable mechanisms for financial support for the global biodata infrastructure, focusing on the identified and prioritized set of Global Core Biodata Resources. The definition of that GCBR set is necessarily the starting point. The rigour of this process will signal to funders that the selected biodata resources have been through a robust review that ascertains their utility to the whole community. This will facilitate the development of rational long-term financing of the biodata resource infrastructure.

What is the benefit of entering the GCBR selection process?

The selected GCBRs will benefit in several ways:

  • GCBR status will provide confidence for researchers selecting repositories to archive their primary data, for example to comply with funders’ and publishers’ open data requirements.
  • Being included in the recognised list of GCBRs means that funding agencies and science publishers will recommend use of those biodata resources to their grantees and authors.
  • For managers of developing data resources, databases that have been identified as GCBRs will provide examples of good practice, fostering collaboration and interoperability.
  • For managers of databases defined as GCBRs, the GCBR community will provide a forum for sharing expertise, driving collaboration, and exploring potential solutions to the challenge posed by their precarious funding.
  • Open data fuels contemporary biological and applied research and allows researchers to access and reuse data, driving discovery. Working toward GCBR status will inspire biodata resources to implement more permissive open data licenses so that they more fully reflect the FAIR principles, to the benefit of everybody.

Application Questions

Expression of Interest

What data are requested in the Expression of Interest stage?

You can see the questions asked at the Expression of Interest stage in the “Data_Required_GCBR_Expression_of_Interest_Suppl_Data” file that accompanies the publication “Global Core Biodata Resources: Concept and Selection Process”.

My biodata resource charges a fee for some services we offer. Does this mean I can’t apply, given that one of the Eligibility Criteria stipulates it must be free of charge?

Provided that all users can access all the data, free of charge, then your biodata resource fulfils that eligibility criterion. There is a useful distinction to be made between the data housed in a biodata resource and the services the biodata resource provides to its users. While a resource may legitimately charge a fee for specific bespoke services, such as developing APIs or generating specific customised bulk downloads, the principle of open data requires that the data itself should be freely available to all users.

For deposition databases, what does “accept deposition …. from the wider international community” mean? How wide does it need to be?

Where a deposition database serves as a repository of record, it must comprehensively serve the archiving needs of all scientists generating that data type, across the international community, irrespective of the location, funding or institutional context of those researchers. This contrasts with a position where a deposition database accepts depositions from, for example, only those researchers responsible for the establishment and running of the biodata resource itself. In such limited circumstances the deposition database would not qualify as a Global Core Biodata Resource.

Why is the call for applications limited to biodata resources made available to users in the English language?

This requirement reflects the wide recognition of English as the global language of science. Subsequent rounds of selection may expand to include biodata resources that do not provide a user interface in English, as the selection process matures and the characteristics of the global biodata infrastructure emerge.

My biodata resource is part of a consortium. Should my application be made as part of the consortium or as an individual biodata resource?

Biodata resources that operate both independently and as a part of a larger umbrella collaboration or consortium will need to consider the context in which to submit their application. Consultation between members of the consortium, and their funders, will be necessary so that all parties are aware of, and have agreed to, the strategy being adopted for application. Where both the independent biodata resource and the consortium are deemed to have qualities in line with GCBR status across the quantitative and qualitative indicators used to define that status, it is legitimate to submit two Expressions of Interest, one for the individual biodata resource and one for the consortium. The details in “Data_Required_GCBR_Full_Application_Suppl_Data” will inform that assessment, as applicants invited to proceed to the Full Application stage will be expected to supply information for all indicators at that time. The Expression of Interest template includes a short answer question where the reasoning behind the application strategy should be explained.

Full Application

What data will be requested at the Full Application stage?

You can see the questions asked at the Full Application stage in the “Data_Required_GCBR_Full_Application_Suppl_Data” file that accompanies the publication “Global Core Biodata Resources: Concept and Selection Process”.

Why do we need to supply three years of Data Resource Usage data?

The characteristics of a GCBR include that it provides a well-established and stable service. Requesting an average, or a single year’s data, would not allow the reviewers to see the usage of the data resource as a trajectory over time. If you do not have a particular data point for your biodata resource for one or more of the Data Usage items requested (for example, some biodata resources collect usage data every two years rather than every year), please include those that you do have, together with an explanatory note.

Is there any guidance for the various detailed data points requested for Indicator 2a Data Resource usage?

Appendix 2 “Methods for Community and Service Indicators” in this article will be helpful for working up data entries for Indicator 2a:

Stockinger, H., et al. (2018). Plan for collation of metrics and quality data at the ELIXIR Hub.

Why is there a choice between providing Web Analytics and Log Analytics for Usage data?

To evaluate each applicant biodata resource, it is necessary to ascertain the scale of its usage. The GBC does not wish to cause unnecessary work for any biodata resource in providing this data. Some biodata resources operate in institutes or other settings that routinely gather usage data as part of their regular operations, using either web analytics or log analytics approaches, and therefore we accept either set of data for this purpose. We recognise that the technical differences between these approaches make it challenging to compare results across methodologies, but so long as the biodata resource presents data across the three years requested using the same approach for each year, an understanding can be reached within the constraints of the technology used.

Why do so many Tables in Indicator 2 mention Unique IP addresses?

The “Unique IP addresses” indicator is used as a proxy for the number of individuals accessing the web site. The number of individuals cannot be measured accurately unless users are required to log in, which is not the case for open data resources, and even then the measure is imperfect. The count of unique IP addresses is itself an imperfect estimate for a variety of reasons. Almost all users have multiple IP addresses, because they use multiple devices to access the data resources and/or because of dynamic IP addressing. Conversely, multiple users may appear to have the same IP address if the institution within which they work configures its system to show only one or a few IP addresses to the outside world.
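
For applicants using log analytics, the following is a minimal sketch of how a yearly count of unique IP addresses might be derived. It assumes access logs in the Common Log Format; the pattern, the function name unique_ips_per_year, and the sample lines are illustrative assumptions and are not part of the GCBR application materials.

```python
import re
from collections import defaultdict

# Assumed input: access-log lines in Common Log Format. A real resource's
# logs may differ, so this pattern is illustrative only.
LOG_PATTERN = re.compile(r'^(?P<ip>\S+) \S+ \S+ \[\d{2}/\w{3}/(?P<year>\d{4}):')


def unique_ips_per_year(log_lines):
    """Return {year: number of distinct client IP addresses seen that year}."""
    ips_by_year = defaultdict(set)
    for line in log_lines:
        match = LOG_PATTERN.match(line)
        if match is None:
            continue  # skip lines that do not fit the assumed format
        ips_by_year[int(match.group("year"))].add(match.group("ip"))
    return {year: len(ips) for year, ips in sorted(ips_by_year.items())}


# Hypothetical sample using documentation-only IP address ranges.
sample_log = [
    '192.0.2.10 - - [03/Feb/2019:10:15:00 +0000] "GET /entry/1 HTTP/1.1" 200 512',
    '192.0.2.10 - - [04/Feb/2019:11:00:00 +0000] "GET /entry/2 HTTP/1.1" 200 734',
    '198.51.100.7 - - [12/Mar/2020:09:30:00 +0000] "GET /search HTTP/1.1" 200 2048',
    '203.0.113.5 - - [12/Mar/2020:09:31:00 +0000] "GET /search HTTP/1.1" 200 2048',
    '203.0.113.5 - - [01/Jan/2021:00:00:01 +0000] "GET / HTTP/1.1" 200 128',
]

print(unique_ips_per_year(sample_log))
```

The sketch does not filter automated traffic such as crawlers, and it inherits the limitations described above: one person may appear under several addresses, and several people may share one.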

Why do you ask for user data in terms of geographic distribution in Indicator 2?

This quantitative data can be derived from the Unique IP addresses data, and is necessary to assess the degree to which the biodata resource is used globally. Applicants should take care to report data at the country level only, as mappings at a finer level of resolution may breach local data protection regulations.
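
As a rough illustration of country-level reporting, the sketch below reduces a list of observed IP addresses to per-country totals, so that no individual address or finer-grained location is carried into the reported figures. The lookup table country_of is a hypothetical stand-in for whatever IP-to-country mapping an applicant already uses; the function name and sample values are likewise assumptions.

```python
from collections import Counter


def unique_ips_by_country(ip_addresses, country_of):
    """Count distinct IP addresses per country.

    Only country-level totals are returned; the addresses themselves
    are not included in the result.
    """
    counts = Counter()
    for ip in set(ip_addresses):  # de-duplicate before counting
        counts[country_of.get(ip, "Unknown")] += 1
    return dict(counts)


# Hypothetical lookup table (IP address -> ISO country code).
example_lookup = {
    "192.0.2.10": "KE",
    "198.51.100.7": "BR",
    "203.0.113.5": "BR",
}
observed = [
    "192.0.2.10", "192.0.2.10", "198.51.100.7", "203.0.113.5", "203.0.113.99",
]

print(unique_ips_by_country(observed, example_lookup))
```

Addresses that cannot be mapped are grouped under "Unknown" rather than dropped, so the totals still add up to the overall unique IP count.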