<?xml version="1.0"?>
<feed xmlns="http://www.w3.org/2005/Atom" xml:lang="en">
	<id>https://docs.uabgrid.uab.edu/w/index.php?action=history&amp;feed=atom&amp;title=Data_Management_Framework</id>
	<title>Data Management Framework - Revision history</title>
	<link rel="self" type="application/atom+xml" href="https://docs.uabgrid.uab.edu/w/index.php?action=history&amp;feed=atom&amp;title=Data_Management_Framework"/>
	<link rel="alternate" type="text/html" href="https://docs.uabgrid.uab.edu/w/index.php?title=Data_Management_Framework&amp;action=history"/>
	<updated>2026-05-10T05:12:10Z</updated>
	<subtitle>Revision history for this page on the wiki</subtitle>
	<generator>MediaWiki 1.38.2</generator>
	<entry>
		<id>https://docs.uabgrid.uab.edu/w/index.php?title=Data_Management_Framework&amp;diff=3517&amp;oldid=prev</id>
		<title>Jpr@uab.edu: Create draft of framework from email thread</title>
		<link rel="alternate" type="text/html" href="https://docs.uabgrid.uab.edu/w/index.php?title=Data_Management_Framework&amp;diff=3517&amp;oldid=prev"/>
		<updated>2011-12-12T17:40:59Z</updated>

		<summary type="html">&lt;p&gt;Create draft of framework from email thread&lt;/p&gt;
&lt;p&gt;&lt;b&gt;New page&lt;/b&gt;&lt;/p&gt;&lt;div&gt;The Research Computing Platform supports defining and curating data sets for use by researchers directly or by reference in data analysis packages.&lt;br /&gt;
&lt;br /&gt;
== Philosophy of Data Management Framework ==&lt;br /&gt;
&lt;br /&gt;
Data sets should be treated in a manner akin to apps. Different apps can have different admin/owner groups, organized by app. An app is a work product whose outcome is a curated application install. Apps are installed in /share/apps/&amp;lt;apptag&amp;gt;, and permissions are set according to the group that maintains the app.&lt;br /&gt;
&lt;br /&gt;
Similarly, data sets should be considered work products whose outcome is a curated data set. As with applications, no single group will manage all data sets. Data sets should be organized in /lustre/projects/public-datasets/&amp;lt;datasettag&amp;gt; (or better, /lustre/data/&amp;lt;datasettag&amp;gt;). Permissions on /lustre/data/&amp;lt;datasettag&amp;gt; should be based on the people who agree to maintain a specific &amp;lt;datasettag&amp;gt;. Some users will be admins on multiple data sets; some groups may bundle several data sets under one datasettag, while others may prefer a strict separation dictated by upstream sources or orgs. (Think GitHub here.)&lt;br /&gt;
&lt;br /&gt;
== Galaxy Example ==&lt;br /&gt;
&lt;br /&gt;
Considering the [[Galaxy]] application, the current /lustre/project/galaxy/public-datasets fits into the above model if you think of this as a curated data set for Galaxy where the dataset admins have chosen to treat a number of distinct data sets as part of a single collection.  This also facilitates developing datasets with additional artifacts that support inclusion in select tools, e.g. a &amp;quot;galaxy public data set&amp;quot;.  It also supports layering dataset products so that one data set might just be the metadata associated with hooking another data set into specific tools.&lt;br /&gt;
&lt;br /&gt;
This organization of apps and datasets helps us treat them as similar abstractions with similar management/curation/oversight demands. It also lets us map Galaxy's needs more clearly into an environment that is consistent across tools.&lt;/div&gt;</summary>
		<author><name>Jpr@uab.edu</name></author>
	</entry>
</feed>