[Q-e-developers] ALL electron and PSEUDOPOTENTIAL data base

Erica Vidal ericavidal at gmail.com
Wed Feb 8 15:28:32 CET 2012


Dear Stefano,

I'm not saying you do not need a database; I'm saying that in order
to design it properly, you must first make clear which data must be
stored and how you intend to use that information.

For me it is essential that you begin to separate data from data
structures. That is why I propose to call them by different names, so that
you notice the difference.
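
To illustrate the distinction, here is a minimal sketch in which the same
data is held in two different data structures. The field names and values
are hypothetical placeholders, not results from any actual calculation.

```python
# The *data*: hypothetical records (names and numbers are illustrative only).
records = [
    {"system": "A", "method": "all-electron", "energy": -1.0},
    {"system": "A", "method": "pseudopotential", "energy": -0.9},
    {"system": "B", "method": "all-electron", "energy": -2.0},
]

# Data structure 1: a flat table (list of rows), in the order received.
table = [(r["system"], r["method"], r["energy"]) for r in records]

# Data structure 2: an index keyed by (system, method) for direct lookup.
index = {(r["system"], r["method"]): r["energy"] for r in records}

# Both structures carry exactly the same information.
same_content = sorted(table) == sorted((s, m, e) for (s, m), e in index.items())
```

Which structure is preferable depends on how the data will be queried, which
is why the requirements have to be settled first.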

It is important not to try to decide at the same time what information must
be stored and how to store it. These processes are related, but they
must be done in separate steps, or you run the risk of building something that
is not general enough, or of making many mistakes.

As in any software project (more or less), first comes the what
(requirements), then comes the how (design), and then the implementation,
testing and validation.

So in this first meeting, I think you should discuss what information to
store, why, and how it will be used. The design stage comes after that and is
about choosing the data structure appropriate to the requirements, which I do
not think it prudent to discuss at a first meeting.

I'm trying to save you a lot of time by avoiding the discussion of issues
before it is time to discuss them.

Best,


-------- Original Message --------
Subject: Re: [Q-e-developers] ALL electron and PSEUDOPOTENTIAL data base
Date: Wed, 08 Feb 2012 12:37:21 +0100
From: Stefano de Gironcoli <degironc at sissa.it>
To: Erica Maran <maran at sissa.it>, Stefano Baroni <BARONI at SISSA.IT>
CC: q-e-developers at qe-forge.org,
    Layla Martin-Samos <lmartinsamos at gmail.com>,
    Emine Kucukbenli <kucukben at sissa.it>,
    Oliviero Andreussi <oliviero.andreussi at epfl.ch>,
    Marco <marco.monni at dsf.unica.it>

Dear Erica
    I'm not sure my idea is correct but I think that what we need is
more a database than a dataset.
    what we want to collect are results of a series of calculations,
that can contain energies, pressure, atomic positions, value of specific
band in a certain point... since some substantial effort would be spent
in generating these data I would keep the full output(s) in the
database, properly organized (... this is the part I don't have any idea
on) and then the required information could be retrieved by
interrogating the system... rather than extract the information and then
throw away the outputs (... I am always afraid that one could forget
some important info that is then no longer available)
    is this compatible with your idea of a dataset ? does it make things
more complicated ?
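
    One way to picture this approach, as a minimal sketch only: keep the
full output text in a table and extract a quantity when it is asked for.
The table and column names, the `extract` helper, and the `name = value`
output format are all hypothetical; real outputs would need their own
parsing rules.

```python
import re
import sqlite3

# In-memory database for illustration; table and column names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE outputs (id INTEGER PRIMARY KEY, label TEXT, raw_text TEXT)"
)

# Placeholder output text, not produced by any real calculation.
fake_output = "some header\ntotal energy = -1.5 Ry\npressure = 0.2 kbar\n"
conn.execute(
    "INSERT INTO outputs (label, raw_text) VALUES (?, ?)",
    ("example-run", fake_output),
)

def extract(label, quantity):
    """Pull one quantity out of the stored full output, at query time."""
    row = conn.execute(
        "SELECT raw_text FROM outputs WHERE label = ?", (label,)
    ).fetchone()
    if row is None:
        return None  # no such run stored
    pattern = rf"^{re.escape(quantity)}\s*=\s*(-?\d+(?:\.\d+)?)"
    match = re.search(pattern, row[0], re.MULTILINE)
    return float(match.group(1)) if match else None

energy = extract("example-run", "total energy")
pressure = extract("example-run", "pressure")
```

    Because the raw text is never discarded, a quantity nobody thought of
at the start can still be extracted later by adding a new parsing rule.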

    I think we should also discuss this issue in the meeting,

    best

stefano





On Wed, Feb 8, 2012 at 12:16 PM, Erica Vidal <ericavidal at gmail.com> wrote:

> Dear all,
>
> So that we understand each other better, I need to define some terms. From
> what I've discussed with Layla, Emine and Marco, the first thing you need is
> to assemble a large dataset.
>
> A *data set (or dataset)* is a *collection of data*, usually presented in
> tabular form. Each column represents a particular variable. Each row
> corresponds to a given member of the data set in question. Nontabular data
> sets can take the form of marked up strings of characters, such as an XML
> file.
>
> And while it is incorrect to refer to the dataset as a *database*, I
> prefer, so that we understand each other better, to restrict this word to
> the *data structure*, rather than all of the data itself. The design,
> construction, and maintenance of a complex database require specialist
> skills.
>
> So the first thing is to *build the dataset*, or rather decide which
> variables are in each column; then, with that information and more specific
> requirements relating to searches in the dataset, I can *design the
> structure of the relational database*. Another step is to *build the
> database*. And another step could be to *build an interface* so that anyone
> can query the database.
>
> As I understand it, this first meeting is to decide which variables or
> columns are needed in the dataset. For that, it helps to have a clear idea
> of what you need to compare, or what problem you plan to solve once
> you have the information organized. In other words, what information will
> you need to extract from the dataset that is too hard to find now?
>
> Best Regards,
>
>
>
> On Wed, Feb 8, 2012 at 10:03 AM, Layla Martin-Samos <
> lmartinsamos at gmail.com> wrote:
>
>> Dear all, we have started an ambitious project that consists in
>> building a huge data base of all-electron calculations on model systems.
>> In the future the data base will also include pseudopotential calculations.
>> Marco Monni is performing the all-electron calculations and Erica Vidal is
>> in charge of building the software framework. After many
>> discussions, we have a first, embryonic idea of what should be included in
>> the data base. We would like to organize an informal meeting to decide on a
>> first draft of the data-base structure. How many of you are interested in
>> participating? Could you give your availability for the month? Once I
>> know the participants and their approximate availability I will open a
>> Doodle poll. For people abroad we can arrange Skype calls or video conf.
>>
>> thank you for your collaboration
>>
>> best regards
>>
>> Layla
>>
>> PS Emine and Oliviero can you update your contact info in the
>> q-e-developers mailing list?
>>
>
>
>
> --
> Erica Vidal
>



-- 
Erica Vidal