Instituto Gulbenkian de Ciência
Building and Managing Biological Collections Databases
Course structure

The course will consist of a series of alternating lectures, practical exercises and discussion sessions. The discussion sessions will be used to draw conclusions from the practical work, and the lectures will prepare students for the exercises and reinforce and supplement the information obtained from them. Attendees will work in small groups on the practical exercises.

Summary of course content

· Introduction to databases and their uses in biodiversity research
· Practical design considerations, types of data, uses and users
· Web-based databases and commercial database systems
· Database design, choice of software, and implementation
· Managing institutional data centres, data standards, data quality
· Integration, web publishing, interoperability and networked database projects

Course plan

MONDAY morning

Introduction

10:00hs - Lecture 1: What is a database? (RW)
- What is a database?
- What are databases used for?
- database system components
- database architectures

11:00hs - Coffee break

11:20hs - Practical Exercise 1a: Web search for database characteristics
- typical uses of databases in biodiversity research and for biological collections
- essential characteristics of a database system

12:00hs - Discussion 1a: Participants’ own application areas, projects, objectives and needs

12:30hs - Practical Exercise 1b: Locate biodiversity information systems on the Web
- tabulate for each one: organisation, name, URL, objectives (what is it for?) and intended users

13:00hs - Lunch

MONDAY afternoon

14:30hs - Discussion 1b: Presenting the findings on organisations, objectives and users

15:30hs - Lecture 2: Biodiversity database systems (RW & ED)
- biodiversity informatics
- data levels (nomenclators, checklists, species databases, specimen databases)
- demonstrations of some systems on the Web

16:00hs - Coffee break

16:20hs - Practical Exercise 2: Searching for species information
- attempt to discover some information about a small number of named plant (or animal) species

17:20hs - Discussion 2: Problems with database systems
- user interfaces
- unreliability arising from inadequate handling of synonyms and other deficiencies

18:00hs - Close

TUESDAY morning

Biodiversity data systems: information content, uses and users

10:00hs - Lecture 3: Biodiversity data types (RW & ED)
- nomenclatural data
- curatorial data
- geographical data, maps
- descriptive data
- images
- bibliographic data

11:00hs - Coffee break

11:20hs - Practical Exercise 3: Investigate selected biodiversity information systems
- select databases from Exercise 1b which match your interests
- add the following columns to your previous table: types of data contained, how the data is presented, whether complex search is available
- evaluate the data content and user interface, noting good and bad points

12:20hs - Discussion 3: Usability of biodiversity information systems
- data types found
- user interface features (conclusions might include the importance of good internal design, which is not immediately obvious from the user interface)
- does the database present the right information in the right way for the intended uses and users?

13:00hs - Lunch

TUESDAY afternoon

Database design

14:30hs - Lecture 4: Data modelling and the relational model (ED)
- entities
- ER diagrams
- relational model

15:15hs - Practical Exercise 4: Plan a database for a particular application
- in groups of 4-6
- decide the attributes and entities
- attempt normalisation into an appropriate set of tables
- use Access to produce a structure diagram (2 or 3 tables)
- produce an ER diagram using PowerPoint

16:00hs - Coffee break

16:20hs - Discussion 4: Database designs
- present the designs
- discuss their pros and cons

17:00hs - Lecture 5: Models of species diversity information systems (RW)
- the taxonomic core (synonymic indexing etc.)
- data models (more detail on data standards later)

WEDNESDAY morning

Implementing a data management system

10:00hs - Lecture 6: Defining needs (RW)
- communicating with users (needs) and potential suppliers (solutions)
- why needed?
- defining needs:
  - for personal or institutional use
  - for managing a biological collection
  - for running a web-based biodiversity information system

10:30hs - Discussion 6a: Users and uses
- from your own experience and objectives, suggest users and uses

11:00hs - Coffee break

11:20hs - Practical Exercise 6: Evaluate existing biological database management systems
- make a list of systems to evaluate
- make a list of possible system features (needs, standards, etc.)
- fill in a table of system characteristics

12:20hs - Discussion 6b: DBMSs available for biodiversity databases

13:00hs - Lunch

WEDNESDAY afternoon

14:30hs - Lecture 7: Setting up a database management system (RW)
- alternatives: choose an existing package or build a new one (based on an underlying DBMS), using tools
- different types of systems (stand-alone, client-server, web-based)
- differences between generic commercial packages (e.g. MS Access) and specialised biological database packages (BG-Base, BG Recorder, Lucid, Alice, etc.)
- getting more information: pointers to some existing systems, web sites with reviews, discussion lists, etc.

15:15hs - Practical Exercise 7: Choice of database management software
- draw up one or more specifications (sets of requirements for a project)
- evaluate several possible DBMSs
- for each specification, choose the best DBMS to meet the requirements
- consider how it can be deployed or implemented

16:00hs - Coffee break

16:20hs - Practical Exercise 7 (cont.)

17:00hs - Discussion 7: Choice of DBMS

18:00hs - Close

THURSDAY morning

Data management

10:00hs - Lecture 8: Data centre management (ED)
- a case study of record systems for living collections

11:00hs - Coffee break

11:20hs - Discussion 8a: Discuss your own institution
- aims
- facilities
- deficiencies
- improvements

12:00hs - Research Talk: Linking Biodiversity Databases (Dr. Richard White)

13:00hs - Lunch

THURSDAY afternoon

14:30hs - Practical Exercise 8: Plan the implementation of an information system or data centre
- including resources, staff requirements, timetable, network infrastructure, etc.

15:30hs - Discussion 8b: Implementation plans

16:00hs - Coffee break

16:30hs - Lecture 9: Data standards (ED)
- living collections (ITF, standard used by zoos)
- herbaria (HISPID)
- other TDWG standards

17:15hs - Practical Exercise 9: Recognise and classify published biodiversity data standards

17:45hs - Discussion 9: Using data standards: adopt, adapt or develop?

18:00hs - Close

FRIDAY morning

Publishing and networking biodiversity information

10:00hs - Lecture 10: Data quality in biodiversity databases (ED)
- data integrity (levels of scrutiny, methods for “data cleansing”)

11:00hs - Coffee break

11:20hs - Lecture 11: Assembling and publishing species diversity databases on the Web (RW & ED)
- linking and merging databases
- online databases, interfaces and gateways
- generating static or dynamic HTML pages (LegumeWeb versus AliceWeb approaches)
- introduction to and demonstration of the output pages AliceWeb can generate (ED)

12:00hs - Practical Exercise 11: Evaluate web-based information delivery systems
- use the ILDIS web site to evaluate both AliceWeb and LegumeWeb
- other systems

12:40hs - Discussion 11: Pros and cons of static and dynamic web page generation

13:00hs - Lunch

FRIDAY afternoon

14:30hs - Lecture 12: Interoperability, networking and cooperative projects (RW & ED)
- BG networks
- Australian Virtual Herbarium
- GSDs
- GBIF
- e-Science and the GRID

15:15hs - Practical Exercise 12: Investigate current research projects and test their prototypes
- ILDIS ("intelligent linking")
- Species 2000
- GRID projects

16:00hs - Coffee break

16:20hs - Discussion 12: Conclusions and future steps

17:00hs - Close

Copyright © 2002 by Richard White, Eduardo Dalcin and the Instituto Gulbenkian de Ciência. All rights reserved.