ff 6.2.23

June 08, 2023
  • 00:00I am so grateful to Caitlin
  • 00:03Throgmorton for joining us today,
  • 00:06having just recently submitted an
  • 00:08R01 with the new DMS policy,
  • 00:11I can tell you how easy
  • 00:13it is to run afoul of it.
  • 00:15And so what we wanted to do for
  • 00:17members of the Janeway Society is
  • 00:19give you a little bit of education
  • 00:21around the data management
  • 00:23guidelines and sharing expectations.
  • 00:25And we're lucky to have Caitlin with us,
  • 00:28who's a data librarian for
  • 00:29the Health Sciences.
  • 00:30She received her Master's in
  • 00:32Library and Information Science
  • 00:33from the University of Washington,
  • 00:35a BA in English and Spanish
  • 00:37from Covenant College,
  • 00:38and she provides really important data
  • 00:40support services to students, faculty,
  • 00:41staff, researchers and clinicians
  • 00:44across her medical campus.
  • 00:46Her research interests include research
  • 00:49data management, data literacy,
  • 00:51and Open Access.
  • 00:52And so Caitlin,
  • 00:53we are so grateful to you for
  • 00:55presenting this and I hope that
  • 00:56people come today with lots of
  • 00:59questions to ask you as they
  • 01:01navigate these requirements.
  • 01:04Absolutely. Thank you for that
  • 01:05kind introduction, Keith,
  • 01:06and I'm actually really happy
  • 01:07that you brought up my long ago
  • 01:10background in English and Spanish.
  • 01:11Sometimes I like to highlight that for
  • 01:13anyone who feels really overwhelmed
  • 01:15by these new data policies and
  • 01:16all the things that go into data
  • 01:18management and some of the things
  • 01:20around programming and data science,
  • 01:22I am proof that
  • 01:23it's possible that you can come
  • 01:24from a humanities background and
  • 01:26really become quite data proficient.
  • 01:29So if that's freaking you out a little
  • 01:30bit and you want to talk to me about it,
  • 01:32please feel free.
  • 01:33So I'm excited to be here to talk
  • 01:35to you about fulfilling new data
  • 01:37management and sharing expectations.
  • 01:38We're mostly gonna focus on the
  • 01:40National Institutes of Health's
  • 01:41new data management and sharing policy.
  • 01:43But I'm gonna talk a little bit just
  • 01:46briefly about some other policies that
  • 01:47are in the works that might be affecting you.
  • 01:50And I really do like my presentations
  • 01:52to be engaging, with lots of questions.
  • 01:54So if you have questions,
  • 01:55please do not hesitate to stop
  • 01:57me as I'm going along.
  • 01:59You can put something in the chat or raise
  • 02:01your hand or just go ahead and unmute.
  • 02:03But I'll also save some time
  • 02:04for questions at the end.
  • 02:05All
  • 02:08right. So like I said,
  • 02:10we're going to talk about the new NIH
  • 02:12policy and what it means for you.
  • 02:14I'm going to give you lots of
  • 02:16advice about how to write those data
  • 02:17management and sharing plans when
  • 02:19you're putting together your grant.
  • 02:20And I'm also just going to give you
  • 02:22some general strategies for how you
  • 02:24can succeed once you've actually done
  • 02:25the grant submission and you're moving
  • 02:27on into the actual research process.
  • 02:29And then like I said,
  • 02:30we will also leave plenty of time for Q&A.
  • 02:34OK. So Keith just mentioned that
  • 02:36he has worked on one of these.
  • 02:37So I'd actually like to see a show of hands.
  • 02:40You can use the raise hand
  • 02:41feature in Zoom
  • 02:42if you have worked on a data
  • 02:44management plan in some capacity.
  • 02:46Perhaps you've assisted a PI,
  • 02:48perhaps you've done this yourself.
  • 02:50But it could have been this year,
  • 02:51it could have been formerly.
  • 02:53It looks like there's a good number of you,
  • 02:55so this is exciting to see.
  • 02:57So some of you have already had some exposure
  • 03:00to this,
  • 03:00and I'm actually looking through
  • 03:02the participants now and I see a
  • 03:04handful of familiar names and faces.
  • 03:06So hi to you all and to those
  • 03:08of you I haven't met yet,
  • 03:09I hope to meet you at some point
  • 03:11in the future, but this is great.
  • 03:13So some of you have already
  • 03:15had some exposure to this.
  • 03:16So now I want to ask, now that you've had it,
  • 03:19if you've had some exposure to this,
  • 03:21how did it go?
  • 03:22You can use the clapping emoji
  • 03:24if you thought it went great.
  • 03:26If you're pretty neutral about it,
  • 03:27you can use the thumbs up, and if you
  • 03:30didn't think that it went so great,
  • 03:32you could use the surprise emoji.
  • 03:35Okay.
  • 03:35I see a few of you didn't maybe
  • 03:37have the best time.
  • 03:38So I hope that this will help
  • 03:40you have a better time
  • 03:42the next time you go forward with this.
  • 03:45I know that this has been overwhelming.
  • 03:46Some of the instructions are not so clear.
  • 03:50Some parts of the process can feel
  • 03:52a little overwhelming and that's okay.
  • 03:54So,
  • 03:54feel free to ask questions
  • 03:56about that as we go along.
  • 03:57If anything I say makes it more confusing,
  • 04:00please jump in.
  • 04:01And at the end,
  • 04:02if anyone wants to talk about
  • 04:04what their experiences have been
  • 04:05like or what you found especially
  • 04:07kind of difficult to wrangle,
  • 04:09I'd love for you all to share that.
  • 04:10It's great for you all to learn
  • 04:12from each other and I think more
  • 04:14sharing in this space can
  • 04:16only improve all of our strategies.
  • 04:18So great,
  • 04:19appreciate you all interacting with me there.
  • 04:22OK,
  • 04:23so we're going to jump in and I'm
  • 04:25going to give you guys the Cliff
  • 04:27notes kind of quick version of what
  • 04:29this all means, what it all looks like,
  • 04:31and then I'm going to get into
  • 04:33some of the
  • 04:34more nitty gritty.
  • 04:36So first I just like to set the tone
  • 04:38and kind of the baseline for why,
  • 04:40why is all of this happening?
  • 04:42In the thick of it, you know,
  • 04:44it just really feels like a lot of rules
  • 04:45and a lot of work and another document
  • 04:47that you need to complete and submit.
  • 04:49But there has been a lot of
  • 04:51thinking that has gone into why
  • 04:53these new policies have come forth.
  • 04:55So the first one is really just this
  • 04:57is an update to an existing policy that
  • 05:00the NIH already had starting in 2003.
  • 05:03Also, I see someone's hand is still up.
  • 05:06Is that a question for me or is
  • 05:07that left over from the previous?
  • 05:12Give it a second if you want
  • 05:14to jump in with a question.
  • 05:16I think it was just left over.
  • 05:17No worries, Okay.
  • 05:19So this is really just a continuation of
  • 05:22a policy that the NIH started in 2003,
  • 05:25so that was 20 years ago.
  • 05:27This was a much needed update.
  • 05:30People like me and other data experts
  • 05:32and data curators and people who
  • 05:34run data repositories have wanted to
  • 05:36see a stronger policy in this arena
  • 05:38from the NIH for quite some time.
  • 05:40NIH actually kind of started to
  • 05:41give hints that they were going
  • 05:43to be releasing an update to
  • 05:45this policy in the 2018-2019
  • 05:47time frame.
  • 05:47And then they released their draft in 2020.
  • 05:51Of course, a lot of us were
  • 05:53overwhelmed with other things in 2020,
  • 05:54not exactly the best time
  • 05:56to release your draft policy.
  • 05:57But they did give people about
  • 05:59three years to kind of
  • 06:01adjust to this new idea,
  • 06:03read through the policy,
  • 06:04provide comments and kind of come to
  • 06:05terms with what all of this would mean.
  • 06:07So it's not really coming out of nowhere.
  • 06:09It's had a trajectory and kind
  • 06:11of a history associated with it.
  • 06:13And then speaking of 2020,
  • 06:15COVID has really accelerated kind
  • 06:18of the appetite for these policies.
  • 06:20The availability of open data,
  • 06:22lots of data sharing between
  • 06:25scientists accelerated a lot of
  • 06:26things in the medical community.
  • 06:28And I think maybe the NIH really felt
  • 06:30like they had to capitalize on that momentum.
  • 06:33And then finally,
  • 06:33this just really seems to be a hot topic.
  • 06:36It's not just the NIH
  • 06:37who's coming out with data sharing policies
  • 06:40and updated data management expectations.
  • 06:42Lots of other federal agencies
  • 06:44are joining forces with this.
  • 06:46The EU has also seen quite a
  • 06:48lot of expansion in this area,
  • 06:50as have other countries outside of the US.
  • 06:52So this isn't an isolated thing.
  • 06:54If anything,
  • 06:55it seems to be accelerating.
  • 06:56And then finally,
  • 06:58I just want to give you the couple
  • 07:00of things that the NIH said about
  • 07:01this and why they're doing it.
  • 07:03They really want to see the
  • 07:05acceleration of results into knowledge.
  • 07:07They really want to maximize the value
  • 07:10of and trust in scientific data.
  • 07:12And they feel that, you know,
  • 07:13making the things that the public has
  • 07:15funded available to the public is important.
  • 07:17And ultimately they want
  • 07:18to improve human health,
  • 07:19which I think is a goal that
  • 07:21we can all rally around,
  • 07:22even in the thick of it,
  • 07:23even when some of these things
  • 07:25feel a little redundant,
  • 07:27a little overwhelming,
  • 07:27a little intimidating.
  • 07:32Okay, so this has already happened.
  • 07:34In case you were not aware,
  • 07:36the new policy has already gone into effect.
  • 07:38It went into effect on the 25th
  • 07:40of January and it now affects all
  • 07:42future new research applications
  • 07:44for grants with the NIH.
  • 07:46So if you're doing renewals,
  • 07:48this may not affect you,
  • 07:49but for pretty much it.
  • 08:07Show that my slides are made available
  • 08:09to you all so you can get all the
  • 08:11links to all the sources and resources
  • 08:12that I'm referencing here.
  • 08:14But essentially the policy
  • 08:15comes down to two parts.
  • 08:17The first happens before
  • 08:18you ever get the award,
  • 08:19and that means that you have to submit
  • 08:21a data management and sharing plan.
  • 08:23So that's step one.
  • 08:24And then Step 2,
  • 08:25if you are actually awarded the grant,
  • 08:27is that you then have to implement
  • 08:29everything that you've described in
  • 08:30that data management and sharing plan.
  • 08:32This is a really, really condensed,
  • 08:35simplified version of what the policy means.
  • 08:38There's obviously way more work
  • 08:39than just these two steps.
  • 08:41Just writing the plan alone can
  • 08:42take some time.
  • 08:43And then of course,
  • 08:44actually carrying out the plan
  • 08:46over the course of a multiyear
  • 08:47grant is quite a bit of work.
  • 08:49But essentially this is to kind of help
  • 08:51you see that there's really just these
  • 08:53two main components that
  • 08:55the NIH is requiring,
  • 08:57even though they have a lot of kind of
  • 08:59extra complexities that come alongside them,
  • 09:01but this is just to give
  • 09:02you the really quick,
  • 09:03quick snapshot of what it entails.
  • 09:06So what is the data management
  • 09:09and sharing plan?
  • 09:10It sounds like some of you
  • 09:12have experienced these before
  • 09:13or have worked on one or worked
  • 09:15alongside a colleague with one,
  • 09:17but the NIH has a really specific
  • 09:19definition of how they want these to look.
  • 09:21But in general,
  • 09:23data management and sharing plans
  • 09:25essentially encompass how you're
  • 09:26going to handle and keep track of
  • 09:28your data throughout the life of a
  • 09:30project and then what you're going to
  • 09:31do with that data at the end of the project.
  • 09:33So that's more or less in a
  • 09:35nutshell what these entail.
  • 09:36But specifically,
  • 09:37what the NIH wants to see is
  • 09:39they want to see six elements.
  • 09:41The first one is your data types.
  • 09:43So what are you going to actually generate?
  • 09:46Are you going to generate survey data?
  • 09:48Are you going to generate clinical data?
  • 09:49Are you going to generate Omics data?
  • 09:51They want you to break out into
  • 09:53categories the different kinds of
  • 09:54data that you're going to generate.
  • 09:56They want you to talk about
  • 09:57approximately how many samples you're
  • 09:59going to have of each data type.
  • 10:00And sometimes they want to
  • 10:01know a little bit about like,
  • 10:02is it going to be raw data?
  • 10:04Is it going to be processed data?
  • 10:05What file format is the data going to be in?
  • 10:08They describe this pretty
  • 10:09clearly on their website,
  • 10:11but essentially they're just
  • 10:12looking for a good overview.
  • 10:14It can even be a bulleted list of all the
  • 10:17data types that you plan to generate.
  • 10:19The next thing they want you to
  • 10:21talk about is data standards.
  • 10:22This can really run the gamut in terms
  • 10:24of how you decide to interpret this.
  • 10:26But this is essentially if you're
  • 10:27going to adhere to any standards for
  • 10:29the data that exist in your field
  • 10:31or in kind of the data
  • 10:33space that you are within.
  • 10:34So this can be things like standards for
  • 10:36how the data is structured and formatted.
  • 10:39There's a lot of these in like
  • 10:40the Omics and sequencing world.
  • 10:42It can just be that you intend
  • 10:44to adhere to a really particular
  • 10:46type of file format and whether
  • 10:48that's like CSVs or PDFs or an
  • 10:50image file type,
  • 10:51and that's really standard in the field.
  • 10:53This can also be way more specific.
  • 10:54You can get into the world of
  • 10:57vocabulary and taxonomy essentially
  • 10:58like how you actually, you know,
  • 11:00write out terms that are covered in the data.
  • 11:02So there are a lot of options for
  • 11:05this, and if you're feeling like you
  • 11:06really want to explore this area.
  • 11:07Please get in touch with me.
  • 11:09I have a lot of resources to
  • 11:12share with you on this and a lot of
  • 11:14kind of guidance to offer,
  • 11:15particularly from the librarian
  • 11:16standpoint, because we really care about
  • 11:19standards in our world over in the library.
  • 11:22The next thing that they want you to
  • 11:24talk about is any related tools,
  • 11:25software,
  • 11:26code used to actually generate
  • 11:28or analyze the data.
  • 11:29So this can be anything from like
  • 11:31really big lab equipment that's
  • 11:33necessary to the generation.
  • 11:37They just want you to basically
  • 11:40tell people what this involves and this
  • 11:43is really for the reproducibility piece.
  • 11:46So these next two bullet points look really
  • 11:48similar. They're data preservation,
  • 11:51access, and associated timelines, and access,
  • 11:54distribution, and reuse considerations.
  • 11:56But they're slightly different. I
  • 11:58think of the first one as really kind
  • 12:00of being facing towards whoever is
  • 12:02going to be the reusers of your data.
  • 12:04So this is related to you know
  • 12:06where you're going to store it,
  • 12:07how people are going to access it and
  • 12:09when the data is going to be available.
  • 12:11The fifth piece, access,
  • 12:13distribution and reuse considerations
  • 12:14is really more facing towards
  • 12:17participants,
  • 12:17so whoever you're collecting the data from,
  • 12:19and this is particularly oriented
  • 12:22towards human participants.
  • 12:23So consent factors that need to go
  • 12:26into what you're actually releasing,
  • 12:29thinking about,
  • 12:30who's affected by the data release,
  • 12:32thinking about, you know,
  • 12:33is there any kind of confidentiality or
  • 12:36deidentification steps that you need to take?
  • 12:38This is really what that
  • 12:41element is discussing.
  • 12:42And then the final element is just
  • 12:44who's going to oversee the data
  • 12:46management and sharing throughout
  • 12:47the life of the grant.
  • 12:48And typically this is the
  • 12:50principal investigator.
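As a rough illustration of the six elements just listed, here is a minimal Python sketch of a checklist that reports which sections of a draft plan are still empty. The element labels paraphrase the talk rather than quoting official NIH wording, and the names `SIX_ELEMENTS`, `missing_elements`, and `draft_plan` are hypothetical, made up for this example.

```python
# Hypothetical checklist of the six plan elements described in the talk.
# The labels paraphrase the speaker; they are not official NIH wording.
SIX_ELEMENTS = [
    "data types",
    "data standards",
    "related tools, software, and code",
    "preservation, access, and timelines",
    "access, distribution, and reuse considerations",
    "oversight of data management and sharing",
]


def missing_elements(draft):
    """Return the elements that have no text yet in a draft plan (a dict)."""
    return [name for name in SIX_ELEMENTS if not draft.get(name, "").strip()]


# Made-up draft: only the first element has been written so far.
draft_plan = {
    "data types": "Survey data (about 500 responses), shared as CSV files.",
    "data standards": "",
}

for name in missing_elements(draft_plan):
    print("Still to write:", name)
```

A checklist like this only confirms that each section has some text; it says nothing about whether the content would satisfy a reviewer.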
  • 13:12Computational Biology a few months ago.
  • 13:14It walks you through 10 steps and 10
  • 13:18simple rules for maximizing the plan
  • 13:20and kind of how to go about walking
  • 13:23through the plan and making sure
  • 13:24that you're in compliance at each step.
  • 13:26And they have lots of resources
  • 13:28in this article.
  • 13:29And in this article they also have this
  • 13:31really nice figure where they break
  • 13:33out the six elements into 10 steps.
  • 13:35And I personally find this
  • 13:37language much more plain language,
  • 13:38much more user friendly.
  • 13:39They kind of walk you through
  • 13:41the steps of describing the data
  • 13:44and choosing documentation types,
  • 13:45describing your tools and software,
  • 13:47using standard file types,
  • 13:49understanding your options for preservation,
  • 13:51potentially finding a data repository,
  • 13:53which I'm going to talk more about.
  • 13:54Coordinating timelines for data sharing,
  • 13:56protecting privacy,
  • 13:57knowing approvals needed for data
  • 14:00accessibility, and final planning.
  • 14:02So this is, I think,
  • 14:03a much more approachable way
  • 14:05to approach how to do this.
  • 14:08It's a little more user
  • 14:09friendly than the language that the
  • 14:11NIH chose for the data elements.
  • 14:12So if you're interested in this,
  • 14:14again,
  • 14:14I really recommend this article
  • 14:16and it's in the references and
  • 14:18I'll make sure you guys get those.
  • 14:20OK,
  • 14:20so now I'm going to talk a little bit
  • 14:21about some other parts of the policy.
  • 14:23The plan is really the biggest
  • 14:25piece and often what people want
  • 14:27guidance on and want
  • 14:29tips for how to get through.
  • 14:31But I also want to talk about
  • 14:32some other parts of the policy
  • 14:33that are crucial for you to know
  • 14:35about and will definitely affect
  • 14:36the way your research process
  • 14:37goes if you're awarded a grant.
  • 14:39So the first one is that the policy
  • 14:41is now going to require much more
  • 14:44aggressive timelines in terms of
  • 14:46when data is supposed to be shared.
  • 14:49So the first one is at time of associated
  • 14:52publication or at the end of the grant,
  • 14:54whichever comes first.
  • 14:55So what this means
  • 14:57is that any data that is associated
  • 15:00with a publication that's been generated
  • 15:03during the time of the grant,
  • 15:05that data has to be released
  • 15:07alongside the publication.
  • 15:09And then whatever else doesn't get
  • 15:11released in publications that's
  • 15:13considered relevant findings has to be
  • 15:15released by the time the grant is over.
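The "whichever comes first" timing rule can be sketched in a few lines of Python. This is only an illustration of the rule as described in the talk, not an official calculator; the function name `sharing_deadline` and the dates below are hypothetical.

```python
from datetime import date


def sharing_deadline(grant_end, publication_date=None):
    """Latest date to share a dataset: publication or grant end, whichever is first."""
    if publication_date is None:
        # Data not tied to any publication is due by the end of the grant.
        return grant_end
    return min(publication_date, grant_end)


# Made-up dates for illustration only.
grant_end = date(2027, 6, 30)
print(sharing_deadline(grant_end, date(2025, 3, 1)))  # data behind a paper
print(sharing_deadline(grant_end))                    # data never published
```

Data behind a paper is due when the paper comes out, and everything else that counts as relevant findings is due when the grant ends.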
  • 15:18So what data exactly is the NIH
  • 15:20wanting you to share and release?
  • 15:23It covers all scientific data generated in
  • 15:25the grant, and this is their definition,
  • 15:28not the clearest of definitions.
  • 15:30I will be honest,
  • 15:31I wish that they would clarify this a
  • 15:34little bit more and lots of groups have been
  • 15:36asking the NIH to do that,
  • 15:37so maybe, cross your fingers,
  • 15:39we'll see more clarity around
  • 15:40this in the next year or so,
  • 15:42but the definition here is data commonly
  • 15:44accepted in the scientific community
  • 15:46as of sufficient quality to validate
  • 15:48and replicate research findings,
  • 15:50regardless of whether the data are used
  • 15:52to support scholarly publications.
  • 15:53So essentially,
  • 15:54if the data really underpins your
  • 15:57findings and would be important for
  • 16:00validation and replication
  • 16:01of what you have discovered,
  • 16:03then they want you to share it
  • 16:05in some way or another,
  • 16:06whether that's through your
  • 16:08publications or at the end of the grant.
  • 16:10And I'm going to talk more
  • 16:12about ways to share.
  • 16:14So some things that the NIH has
  • 16:17outlined for ways to share.
  • 16:19They have, first and foremost,
  • 16:21really encouraged you to
  • 16:23use data repositories,
  • 16:24and I'm going to talk a little bit
  • 16:25about why data repositories are an
  • 16:27ideal choice if it's available to you.
  • 16:29But they also have other options,
  • 16:31some of which I feel like some of
  • 16:32this has actually been skipped over
  • 16:34in a lot of materials that have been
  • 16:36made available about the policy,
  • 16:37and people have really heavily
  • 16:39focused on data repositories,
  • 16:40and for good reason.
  • 16:41They are really the ideal choice.
  • 16:44But there are other options.
  • 16:45Things like data enclaves,
  • 16:46which are really secure places to
  • 16:49put data that usually have a lot
  • 16:51of controlled access features that
  • 16:53require identification of who's
  • 16:54actually going to use the data.
  • 16:56Often pretty strict use and licensing
  • 16:59agreements are entailed with data enclaves.
  • 17:02Sometimes data enclaves are
  • 17:03actually physical,
  • 17:04like you have to go to a physical
  • 17:06location and usually disconnected
  • 17:07from the Internet to access the data.
  • 17:10That's a pretty extreme situation,
  • 17:12but some people do have really sensitive
  • 17:14data that would qualify for
  • 17:16this type of situation.
  • 17:17They've also said that it's acceptable
  • 17:20to share data under the auspices of
  • 17:23the investigator in some situations
  • 17:25and that really means anything that
  • 17:27you see fit to do and that could
  • 17:30be a lot of different situations.
  • 17:32And then finally they've said
  • 17:33that it may be mixed modes.
  • 17:34So maybe you'll put 90% of your
  • 17:36data in the data repository,
  • 17:38but then 10% that's highly sensitive
  • 17:40needs something else unique that
  • 17:42you may need to put together.
  • 17:44So there's a lot of room here
  • 17:46for taking a lot of different
  • 17:48approaches to data sharing.
  • 17:49I don't think the NIH has wanted
  • 17:51to lock people into, you know,
  • 17:53a box when it comes to this.
  • 17:54But I will say that data repositories
  • 17:56are often your best choice.
  • 17:58So I am going to talk more about those.
  • 18:01Okay.
  • 18:01And then just a couple of other things
  • 18:04to keep in mind about this policy.
  • 18:06So plans are just two pages.
  • 18:08So the plan that you're gonna
  • 18:10submit before you get the grant to
  • 18:12the NIH about your data management
  • 18:14and sharing is just two pages,
  • 18:15it's really not a lot of space.
  • 18:19So you have to be pretty brief here,
  • 18:20pretty concise and they are not looking for,
  • 18:23you know, real depth.
  • 18:24They just want you to give a good
  • 18:26breadth of what you're going to do.
  • 18:27You're also making this plan before
  • 18:29you even start your research.
  • 18:30So there's only so much that
  • 18:32you can be super descriptive
  • 18:33and comprehensive about.
  • 18:35You're essentially making a plan.
  • 18:37It's not set in stone.
  • 18:38So there's some room here
  • 18:41to potentially update,
  • 18:42which I'll talk about in a second.
  • 18:44I also want to point out
  • 18:45that at least at this point,
  • 18:46plans are not part of scored peer
  • 18:50review criteria unless otherwise
  • 18:51noted in the funding announcement.
  • 18:53There's a handful of departments
  • 18:55of the NIH that are doing that,
  • 18:57but they're pretty rare at this point.
  • 18:59So as of now, this isn't going
  • 19:02to affect your grant funding.
  • 19:04So I like to call this the
  • 19:05anxiety reducing slide.
  • 19:06This is the one that can hopefully help
  • 19:08you take a deep breath and realize,
  • 19:09that, you know, this is important,
  • 19:12but it hopefully isn't going
  • 19:13to affect your funding.
  • 19:15And then plans can be updated,
  • 19:16so again, not set in stone.
  • 19:18It can be updated over the
  • 19:20course of the grant.
  • 19:21And the NIH has outlined lots of reasons
  • 19:23why you might want to make updates.
  • 19:26And then I also want to point out,
  • 19:27even though I'm not gonna talk
  • 19:28about this in this presentation,
  • 19:30budgeting for data management
  • 19:31and sharing is really important.
  • 19:33I've linked some resources from
  • 19:35the Office of Sponsored Projects
  • 19:37and they have a budgeting tip
  • 19:38sheet and lots of other great
  • 19:40resources to help you do this.
  • 19:41And you can talk to people over in
  • 19:43the Office of Sponsored Projects
  • 19:44to get assistance with this.
  • 19:45But the one thing I will say here
  • 19:47is that I and basically everyone
  • 19:49else in the Office of Sponsored
  • 19:52Projects are really encouraging
  • 19:53you to not list $0 here.
  • 19:55Data management and sharing does
  • 19:57not come at no cost. It's a pretty
  • 20:00labor-intensive process.
  • 20:02Even if you're not actually utilizing
  • 20:04any services that cost dollars,
  • 20:06you're almost assuredly going to
  • 20:08need to cover labor costs for things
  • 20:11like curation and data cleaning and
  • 20:13organizing all of your resources and
  • 20:15packaging things up for publication.
  • 20:17There will almost assuredly be costs,
  • 20:20so consider budgeting at
  • 20:21least something for this,
  • 20:23even if it's not a lot.
  • 20:25And then there are a handful of cases
  • 20:27in which the policy doesn't apply.
  • 20:29I have linked resources for this
  • 20:30for you to look at.
  • 20:32And if you think that you might be in a
  • 20:33situation where the policy doesn't apply,
  • 20:35feel free to reach out to me and
  • 20:36I'm happy to help you clarify it.
  • 20:37But it does apply in pretty much all cases.
  • 20:41If you're generating data and
  • 20:42you're getting money from the NIH,
  • 20:44you're probably going to fall
  • 20:46under this policy. OK.
  • 20:48So then I quickly want to say something
  • 20:51about kind of what's ahead and what
  • 20:53we might be seeing in the future.
  • 20:55And I'm assuming many of you,
  • 20:56if you're going to be seeking funding,
  • 20:58will probably be getting that from the NIH.
  • 21:00But there are lots and lots and
  • 21:02lots of other funders out there and
  • 21:04lots and lots of other places that
  • 21:06require this sort of thing already.
  • 21:07And if they don't yet,
  • 21:09they may in the future.
  • 21:11So just to give you kind of a quick
  • 21:13understanding of what's been happening here.
  • 21:15So like I said, the policy draft for
  • 21:18the new NIH policy was released in 2020.
  • 21:21Last year in August, the White House
  • 21:23released something called the Nelson Memo,
  • 21:25which came out from the Office of
  • 21:28Science and Technology Policy.
  • 21:29This memo states that all U.S.
  • 21:31Federal agencies must update their
  • 21:34public access policies by 2025,
  • 21:36so that's about 2 1/2 years
  • 21:39away at the latest.
  • 21:40This new memo is going to
  • 21:44remove publication embargoes.
  • 21:45And it's also going to require
  • 21:47more open data sharing.
  • 21:48And so I'll talk a little bit more
  • 21:50about that on the next slide.
  • 21:52And then the new NIH policy is in effect.
  • 21:56And then the NIH has already
  • 21:58responded to this new White House
  • 22:00memo about the public access policy,
  • 22:02basically saying that their new policy
  • 22:04that they rolled out in January 2023
  • 22:06is their version of compliance.
  • 22:09And then of course, by the end of 2025,
  • 22:11we're going to see way more
  • 22:12responses to this memo from all U.S.
  • 22:14Federal agencies.
  • 22:15So this is going to have a pretty big effect.
  • 22:18If this is something that you're
  • 22:20interested in or if you expect to
  • 22:22receive funding from some other
  • 22:23federal US agency,
  • 22:25I encourage you to go to
  • 22:26open.science.gov. They're tracking
  • 22:28basically all the responses and
  • 22:30various public comments and various
  • 22:33plans that are getting released
  • 22:35from federal agencies in response
  • 22:37to the new memo.
  • 22:38So basically in short,
  • 22:39if you take nothing away from this slide,
  • 22:41this stuff is happening all across the
  • 22:44country and really all across the world.
  • 22:46There seems to be a much
  • 22:48stronger trend towards data
  • 22:50management and data sharing and
  • 22:52lots of funders are coming out
  • 22:54with more policies to enforce this
  • 22:57and to encourage compliance.
  • 22:59So just to quickly say what
  • 23:02the OSTP memo says about data.
  • 23:05So for those with federally funded research,
  • 23:07they want people to be sharing
  • 23:09data freely and publicly at the
  • 23:11time of publication,
  • 23:12unless it's subject to various
  • 23:14limitations like human subjects
  • 23:16sensitivity or perhaps you have
  • 23:18unconsented patients, or all kinds
  • 23:20of reasons that can potentially
  • 23:21limit you from sharing data.
  • 23:23But in general,
  • 23:24they want people sharing
  • 23:26data freely and publicly.
  • 23:27They want people sharing data
  • 23:29even outside of publications.
  • 23:31And they have a lot of recommendations
  • 23:33about how they want that data to
  • 23:34be shared and what they want those
  • 23:36repositories to look like when the
  • 23:38data is deposited in data repositories.
  • 23:40These guidelines are really excellent
  • 23:42and the NIH more or less uses the
  • 23:45same ones for their recommendations.
  • 23:46So I'll be pointing you to those
  • 23:49in the resources.
  • 23:50And I also just want to highlight that
  • 23:53this memo basically says that it's
  • 23:54up to each federal agency to come
  • 23:57up with their own policy and their
  • 23:58own guidance and their own recommendations.
  • 24:01So hopefully there will be a lot more
  • 24:03guidance and a lot more resources out
  • 24:04there over the next couple of years
  • 24:06both. And I just send the example,
  • 24:09they're not always the easiest
  • 24:10to interpret and that's part
  • 24:11of the reason that I'm here.
  • 24:13So hopefully we'll see more resources,
  • 24:15how clear they will or won't be,
  • 24:17No promises. OK.
  • 24:20So I'm going to start to talk about
  • 24:22kind of how to actually do some of this.
  • 24:24That was the overview,
  • 24:25but I want to pause here and see if
  • 24:28there's any questions at this point.
  • 24:37So why do you
  • 24:38think they took this approach rather
  • 24:44than provide guidelines for what to
  • 24:48do with particular types of data?
  • 24:52Yeah, it's a really
  • 24:54interesting question,
  • 24:55and to some extent they have,
  • 24:57a little bit. They still have the
  • 25:00genomic data sharing policy which
  • 25:02is specific to that kind of data.
  • 25:05The plans are going to be harmonized
  • 25:08more or less so that basically your
  • 25:10plans will look the same regardless
  • 25:12of data type that you have.
  • 25:13But you do have more requirements
  • 25:15if you have genomic data.
  • 25:17So it's a great question.
  • 25:19I mean to some extent they are doing this by
  • 25:22letting each individual NIH center add on
  • 25:26kind of their own requirements
  • 25:27and their own recommendations,
  • 25:29so you should read funding announcements
  • 25:32really carefully, because they
  • 25:34may have extra add-ons to this.
  • 25:36But I think to some extent it seems
  • 25:39like NIH wants to standardize,
  • 25:40and it sounds like they wanted to remove
  • 25:43some of the burden of this paperwork,
  • 25:45like they wanted to make things
  • 25:47shorter and more consolidated.
  • 25:48I'm not really sure that they've
  • 25:51achieved that, based on what
  • 25:52I have heard from researchers.
  • 25:54It seems like a lot of this is
  • 25:55redundant to some of the other
  • 25:57documents that have to be created,
  • 25:58so there's certainly lots of
  • 26:01room for improvement there.
  • 26:02But I will also say that to some extent,
  • 26:05because the plans are so short,
  • 26:06they're only two pages and you're really
  • 26:08only covering 6 elements in a really quick,
  • 26:11concise summary,
  • 26:13a summative form,
  • 26:15to some extent they are offshoring
  • 26:17some of that
  • 26:19data-type-specific guidance
  • 26:21to data repositories.
  • 26:23I'm going to talk a little bit more
  • 26:24about that as I talk about data
  • 26:25repositories in the next couple of slides.
  • 26:27But most data repositories have
  • 26:30sometimes pretty rigorous steps
  • 26:32as to how they want data to be
  • 26:35prepared to deposit with them.
  • 26:37So to some extent NIH is just not
  • 26:39covering that at all saying you
  • 26:41know pick a repository if you have
  • 26:43one and then follow their guidance.
  • 26:45So to some extent they are doing that;
  • 26:46they're
  • 26:47just handing that off to another
  • 26:49player in the process.
  • 26:51So does that help answer
  • 26:53your question?
  • 26:53It's kind of a wishy-washy answer.
  • 26:56I am with you
  • 26:57in that I kind of wish that they had
  • 26:59provided more guidance around more data.
  • 27:01Yeah, I guess it just seems to
  • 27:04me that just taking genomic
  • 27:06data and you know as an example,
  • 27:08you know if it's single cell data
  • 27:10it goes here, if it's, you know,
  • 27:13RNA-seq and ChIP-seq and these
  • 27:16kinds of data they go here,
  • 27:18but maybe that feels too restrictive
  • 27:20because there's always some sort of
  • 27:22exception or something and so they don't
  • 27:24want to dictate quite to that level, but.
  • 27:28Yeah. And I will say that there
  • 27:30are a lot of groups right now
  • 27:32working on that kind of resource.
  • 27:34Like you said, there's potential limitations
  • 27:37when you get really prescriptive about
  • 27:39things because there's always outliers.
  • 27:41But I think that it's very likely
  • 27:43that you're going to see some kind
  • 27:45of resource that looks very similar
  • 27:46to what you just described there.
  • 27:48You say I have this kind of data and they
  • 27:51will give you guidance on how you go
  • 27:53forward with that. I've heard of
  • 27:56at least three groups at this
  • 27:58point that are trying to put
  • 27:59together a resource like that.
  • 28:00So when something like that's available,
  • 28:02believe me,
  • 28:02I will be telling everyone about it.
  • 28:04So I would love to see something
  • 28:06like that just as much as you do.
  • 28:10Yeah. Shelly, hi. This is really helpful.
  • 28:13I had a couple of questions.
  • 28:15The first one is I had to write
  • 28:17one of these recently and it wasn't
  • 28:19super clear to me what kind of
  • 28:22demographic data was required.
  • 28:24So, like the last person,
  • 28:26I'm probably doing similar stuff,
  • 28:28genomic experiments on
  • 28:30tissue from human subjects.
  • 28:32The sharing of the genomic data
  • 28:34is the easy part, I think,
  • 28:35because we've all done that before.
  • 28:37But it seems to me like we also
  • 28:39had to make a plan for sharing
  • 28:41like principal demographic
  • 28:42information on the subjects.
  • 28:43And I don't know that's difficult
  • 28:46to navigate given our tendencies
  • 28:48to not share those things usually.
  • 28:51So that's my first question.
  • 28:53My second question was,
  • 28:54are folks
  • 28:55like you in the library available to help us
  • 28:58write these plans and should we know
  • 29:02what's the mechanism for doing that?
  • 29:04Yeah. So I'll actually take
  • 29:06your last question first.
  • 29:07So I am available to help review them.
  • 29:09I can't really help write them just
  • 29:12because they're often so specific to
  • 29:14the type of science that you're doing.
  • 29:16But once you think you've gotten
  • 29:17pretty close to a final draft,
  • 29:19I am always happy to review them
  • 29:20and kind of give you some final
  • 29:22pointers for like you know,
  • 29:23maybe you should have included this
  • 29:25or what about this data repository.
  • 29:27So I'm always happy to do kind
  • 29:29of a spot check and help you
  • 29:31with anywhere that you're stuck.
  • 29:33As for your first question,
  • 29:35so yeah, so clinical data,
  • 29:37particularly clinical demographic
  • 29:39data, is gonna be tricky.
  • 29:42There are some data repositories available.
  • 29:44Vivli is a really good example.
  • 29:46I'm gonna have it on one
  • 29:48of the slides coming up.
  • 29:51It comes with some really nice
  • 29:54controlled access features.
  • 29:55So they do things like make sure
  • 29:58that people are like at a research
  • 30:00institution before they access the data.
  • 30:02I think they also usually ask them to
  • 30:03submit a proposal so you as the researcher
  • 30:05can do things like, say, you know,
  • 30:07I only want people to reuse this data
  • 30:09if they're doing this kind of research.
  • 30:11So there's a good bit of kind of control
  • 30:13there to protect participants and to
  • 30:16protect yourself and protect the research.
  • 30:18So that's one option.
  • 30:20Another is just doing pretty
  • 30:23intense de-identification.
  • 30:24So there's some repositories out there
  • 30:26that will accept really just about
  • 30:28any kind of data, but they're going to
  • 30:30want it to be heavily de-identified.
  • 30:32So that is where I get into the,
  • 30:35the place of re-encouraging you
  • 30:36again to think about what are some
  • 30:38things that you might ask for in your
  • 30:40budget that might make that easier.
  • 30:42Do you need to hire a team of,
  • 30:44you know,
  • 30:44a couple postdocs to do lots
  • 30:46of de-identification?
  • 30:48Or, you know, what,
  • 30:49what do you need there in terms
  • 30:51of that to get that done in a way
  • 30:53that feels feasible and doable?
  • 30:55The other option is that if you
  • 30:57have data that you really just
  • 30:59feel like can't be shared,
  • 31:00there's plenty of avenues for that.
  • 31:02NIH has like a kind of almost like a whole,
  • 31:04like flow chart on like,
  • 31:05do I actually need, you know,
  • 31:07can I actually share this data?
  • 31:08And if I can't,
  • 31:09what are my justifications for
  • 31:11not sharing it?
  • 31:12So that's always the potential as well.
  • 31:13Maybe it just isn't a good idea to share it.
  • 31:16In that case,
  • 31:17you just say that in your plan and
  • 31:18you give a justification for it.
  • 31:23Anything else there that I can answer?
  • 31:24I see a hand went up with a question. Go ahead.
  • 31:29Hi Caitlin, thanks for this talk.
  • 31:30Really important stuff.
  • 31:31So I have not done one of these before.
  • 31:35And my question is in regards to is this
  • 31:38for all data like even animal data,
  • 31:41Western blots, ELISAs? How,
  • 31:43how do you see that being reported?
  • 31:48Yes. So it is all scientific data.
  • 31:51So you know if it's underpinning findings
  • 31:53that you're gonna release kind of as
  • 31:55part of the main findings of the grant,
  • 31:56then absolutely yes that is included.
  • 31:59Pretty much at this point,
  • 32:00the only kinds of data that are
  • 32:02excluded are, like, data about students,
  • 32:03if you're on like a training or
  • 32:05an education grant from the NIH.
  • 32:07There's really not a lot of exclusions.
  • 32:10Pretty much if you're generating scientific
  • 32:12data, it is going to be covered.
  • 32:14So yes, animal specimens, absolutely.
  • 32:17In fact, I would say at this point
  • 32:18that's the majority of the plans
  • 32:20that I have personally laid eyes on.
  • 32:22The good news is if you're dealing
  • 32:24with animal specimens,
  • 32:25you're probably going to have
  • 32:27fewer issues in the realm of
  • 32:29needing to de-identify or having to
  • 32:32deal with kind of confidentiality
  • 32:35and sensitivity concerns. But yes,
  • 32:37if you're in the image realm,
  • 32:38that is kind of an ongoing discussion
  • 32:41in the image
  • 32:42data sphere: how much should be shared.
  • 32:44I know Western blots in particular are huge,
  • 32:47and sometimes it's, you know,
  • 32:49uncertain what's really useful
  • 32:51to share for reuse.
  • 32:53So if you want to talk more
  • 32:55about that after today's session,
  • 32:56I'm happy to point you to some resources.
  • 32:58There's some good
  • 32:59discussions and good guidance kind
  • 33:01of forming around some of the
  • 33:03stuff related to image data,
  • 33:05but none of it, I would say, is
  • 33:07like cut and dried at this point.
  • 33:08So some of that's just you know
  • 33:10describing to the NIH what you can
  • 33:13do that's reasonable and potentially
  • 33:15coming up with justifications if you
  • 33:17feel there's data that shouldn't
  • 33:19be shared or that, you know, just
  • 33:21wouldn't be useful to reshare
  • 33:23and to make available for reuse.
  • 33:30Hope that helps.
  • 33:33Okay, other questions?
  • 33:36All
  • 33:43right, well with that I think I'm
  • 33:44going to head into a couple more tips,
  • 33:46a few of which I hope will make some
  • 33:48of this a little bit easier for some
  • 33:50of the questions that you all have
  • 33:52brought up or at least give you a
  • 33:54couple of avenues to follow.
  • 33:56So one of the first things for any
  • 33:57of you who are here who haven't
  • 33:59done one of these before,
  • 34:01I really recommend that you use
  • 34:03some of the tools and templates
  • 34:05that are available out there.
  • 34:07One of the first ones that I'm
  • 34:08gonna recommend here is DMP Tool.
  • 34:10I'm actually going to click on
  • 34:11this and take you out and show
  • 34:12you what this looks like.
  • 34:14So I'm actually logged in right now
  • 34:16to DMP Tool and I'm gonna pop in here.
  • 34:20So you log into this,
  • 34:21we license this,
  • 34:22and this is a free resource for
  • 34:24you here at Yale.
  • 34:25You log in with your Yale address and
  • 34:28you are basically able to use this for
  • 34:31free with a yale.edu e-mail address.
  • 34:33And essentially what this is,
  • 34:34is a data management plan generator.
  • 34:38And what's beautiful about this
  • 34:40is if you head over here to
  • 34:42this middle button, Write Plan,
  • 34:44You'll notice that I've already
  • 34:46selected that this is an NIH
  • 34:48general Data Management and Sharing
  • 34:50plan for the 2023 new policy.
  • 34:54And because I selected that, it
  • 34:56already knows what the six elements
  • 34:58are that the NIH requires for this
  • 35:00plan and if we click into one of these,
  • 35:02like data type for instance.
  • 35:05And it gives you these really nice,
  • 35:07fill in the blank prompts and I
  • 35:09find these incredibly useful.
  • 35:11When I'm giving advice to
  • 35:12people on how to write these,
  • 35:14I've found that for a lot of people,
  • 35:15particularly early career people
  • 35:17who maybe haven't had the chance
  • 35:19to do one of these yet,
  • 35:20that this can really help you
  • 35:21go through the process.
  • 35:23They also give you sample answers.
  • 35:25They also give you guidance from the NIH.
  • 35:27They link you out to all the
  • 35:29different parts of the guidance.
  • 35:30They also have their own guidance
  • 35:32that people at DMP Tool have written.
  • 35:35This is run by quite a few of the
  • 35:37universities in California and
  • 35:39the California Digital Library,
  • 35:41although they've made it available
  • 35:42to basically everyone,
  • 35:44and they also have really nice guidance
  • 35:45here on how to go about doing this.
  • 35:47I find it in a lot of ways easier to
  • 35:50digest than what the NIH has provided,
  • 35:53so this is really nice.
  • 35:54You can also invite collaborators into
  • 35:56this, and when you're done you can export it
  • 35:59in lots of different formats, like
  • 36:02PDF. I think even as a CSV, if
  • 36:04you really wanted to do it that way.
  • 36:05You can also make it public,
  • 36:07or you can make it private to all
  • 36:09your collaborators.
  • 36:10So there's a lot of kind of extra
  • 36:11built in features here that are really nice.
  • 36:13So this is a good thing to know about.
  • 36:16I totally encourage you to use it.
  • 36:19I hear really positive things about
  • 36:21it from people who've tried it,
  • 36:23so if that sounds like it could help,
  • 36:25I definitely recommend using it.
  • 36:28The other thing is that the NIH keeps
  • 36:29making more and more sample plans available.
  • 36:31I think right now they have about
  • 36:34over a dozen available and there
  • 36:36are lots of other ones out and
  • 36:39about on the Internet to review.
  • 36:41DMP Tool itself also has lots of
  • 36:43plans that people have made public
  • 36:45available for you to review,
  • 36:46so there's a lot floating about.
  • 36:49I also recommend just consulting
  • 36:51with your colleagues.
  • 36:52You've already heard on this call that
  • 36:53some people have already done this,
  • 36:55so you know,
  • 36:56if you have colleagues who are amenable
  • 36:58to sharing their plans with you,
  • 37:00that can be really helpful to borrow
  • 37:01language from people who've already done it,
  • 37:03particularly if they're researching
  • 37:05really similar things to you.
  • 37:07And I also just point out that your
  • 37:09School of Medicine has a grant library.
  • 37:11And I think Nick probably knows a
  • 37:13little bit more about that than I do.
  • 37:14It probably doesn't have any of
  • 37:16the new NIH grants in it yet,
  • 37:18but eventually it will.
  • 37:20So that's also something to
  • 37:22check out and take a look at.
  • 37:24I also want to just talk a little
  • 37:26bit about kind of this part that
  • 37:29isn't explicitly stated in the plans,
  • 37:32but will definitely be part of the
  • 37:34process and can really set you
  • 37:36up for sharing data well later.
  • 37:39And that's the data contextualization.
  • 37:41So keeping good notes and good
  • 37:44track of what your data means,
  • 37:46so that when you're asked to share it,
  • 37:49sometimes a year, two years,
  • 37:51three years from when you
  • 37:53actually generated it,
  • 37:54you know what it means and you can
  • 37:56describe it really well to another reuser.
  • 37:59So this includes things like documenting
  • 38:01your data using metadata standards.
  • 38:04and adhering to data standards
  • 38:06and best practices.
  • 38:07I have a lot of resources for this at
  • 38:09the end of the slides and this could
  • 38:11be a whole presentation unto itself,
  • 38:14so I won't get too far into the weeds here.
  • 38:15But just a note to say that it's really
  • 38:18important to do this one for yourself,
  • 38:19just so that you understand the
  • 38:21data later when you come back to it,
  • 38:22sometimes many months later,
  • 38:24and also for the people who might
  • 38:27be using your data again in the future.
  • 38:30And I also want to talk a little
  • 38:31bit about this.
  • 38:32It's kind of already been mentioned a
  • 38:34little bit in some of your questions.
  • 38:37But how do you protect participants
  • 38:39through this process,
  • 38:40particularly human participants?
  • 38:43So one of the first things to really
  • 38:45think about, and I think one of the kind of
  • 38:47instigations for why this policy
  • 38:49has come about is to really think
  • 38:51about consenting early on.
  • 38:52You want to make sure that you're
  • 38:54really explicit in the consent
  • 38:56documents about what's going to
  • 38:57happen with patient data,
  • 38:59whether they want it to be shared,
  • 39:00how it's going to be shared,
  • 39:01what that sharing is going to look like,
  • 39:03who it's going to be made available to.
  • 39:04NIH has a lot of good documentation and
  • 39:08guidance on how to go about doing this.
  • 39:11They even have sample consent
  • 39:12language to include on your forms.
  • 39:15But you want to make sure that you're as
  • 39:17upfront about this as possible to protect
  • 39:19your participants and make sure they
  • 39:20understand what they're getting into.
  • 39:22And you also want to go ahead and make plans
  • 39:24for data de-identification if that's
  • 39:26going to be something that you have to do.
  • 39:28That can be a pretty time intensive
  • 39:31process that usually has statistical
  • 39:33components involved with it.
  • 39:34So make sure that you involve the right
  • 39:37people in this and that you budget
  • 39:39accordingly if you're going to need to do it.
  • 39:42And then I also want to talk a little bit about
  • 39:44this because I think sometimes these
  • 39:45can just feel like requirement after
  • 39:47requirement and just another thing to do.
  • 39:50But there's potential kind of
  • 39:53benefit here for you and what this means
  • 39:56for kind of your research outputs.
  • 39:58But first I do, I forgot,
  • 39:59I also want to talk about this too.
  • 40:01There's some things that you can do
  • 40:03if you're kind of concerned about
  • 40:05your research getting out and about
  • 40:06and potentially into the wrong hands,
  • 40:08particularly if you have really
  • 40:10sensitive research, so.
  • 40:11One of the first things that you can do
  • 40:13is you can look for data repositories
  • 40:15that have access controls or this is
  • 40:17something that you can kind of retrofit
  • 40:18yourself in the data sharing process.
  • 40:20But essentially you kind of,
  • 40:22you know,
  • 40:22keep some of the data back based on criteria.
  • 40:25So maybe you only hand it over to
  • 40:27academic researchers,
  • 40:28maybe you only hand it over to people
  • 40:30doing certain kinds of research.
  • 40:31There can be reasons for that,
  • 40:33particularly if you have really
  • 40:35sensitive data.
  • 40:36I'm thinking of like recently I
  • 40:38heard from somebody with intimate
  • 40:39partner violence data.
  • 40:41covering really sensitive groups.
  • 40:42You know,
  • 40:43that's the kind of data that you
  • 40:45don't want to just openly share unless
  • 40:46it's been heavily de-identified.
  • 40:48So if you're going to share it in
  • 40:50a really granular format, that's
  • 40:52probably going to come with pretty
  • 40:54tight access controls.
  • 40:55You can also utilize data licensing and
  • 40:57data use agreements to really carefully
  • 40:59specify how the data should be reused.
  • 41:02So that's another way to go
  • 41:04about thinking about this.
  • 41:04And most data repositories will already
  • 41:06have lots of data licenses for you to
  • 41:09choose from to add to your data that
  • 41:11users have to agree to before downloading.
  • 41:14And then this is the piece that
  • 41:16I want to talk about that's kind
  • 41:17of the benefit for you.
  • 41:19When we do all these sorts of things
  • 41:21like deposit data in a data repository
  • 41:24and put it in a place that's, you know,
  • 41:26somewhat curated and controlled,
  • 41:28that's going to come with
  • 41:29lots of extra features.
  • 41:30Things like persistent identifiers,
  • 41:32so digital object identifiers, or DOIs,
  • 41:35something that you're probably used
  • 41:36to seeing on your publications.
  • 41:38Data sets can get DOIs as well, and those
  • 41:41are trackable and they can be cited.
  • 41:43So there's some benefits
  • 41:44here, in that, you know,
  • 41:45while you're releasing
  • 41:47your data set out into the
  • 41:49world for other people to use,
  • 41:51you can kind of track what their
  • 41:52reuse was like much in the same way
  • 41:53that you would with a publication.
  • 41:55And that can go into your metrics
  • 41:57and your impact and go into kind
  • 41:58of what you cite as the output
  • 42:00of your work and your research.
  • 42:04OK, so finally I want to talk about this.
  • 42:07Again, this could probably be its own
  • 42:09whole workshop, and I do actually
  • 42:11have a whole workshop on this.
  • 42:13And so if you're interested,
  • 42:14keep an eye out for that
  • 42:16on the library calendar.
  • 42:17But I really want to encourage you,
  • 42:18if possible, where feasible
  • 42:20and where one exists for you to
  • 42:23take advantage of repositories.
  • 42:24They can really simplify the long-term
  • 42:27preservation piece that the NIH is
  • 42:29asking for you to describe in those plans.
  • 42:32It really takes a lot of the guesswork
  • 42:34out of curation and kind of
  • 42:36long-term sustainability of the
  • 42:38data and lots of things that the
  • 42:39NIH wants you to think about,
  • 42:40like how are people going to find it,
  • 42:42how are they going to access it,
  • 42:43how are they going to download it?
  • 42:45Repositories already have all
  • 42:46of that stuff built in.
  • 42:48NIH also has a lot of repositories listed.
  • 42:52So I encourage you to check those out.
  • 42:54And then there's a few more that
  • 42:55I want you to know about.
  • 42:57The first one is Dryad.
  • 42:59We are members of Dryad, and Yale
  • 43:02affiliates with a yale.edu e-mail
  • 43:05address get up to 300 gigs of data
  • 43:08with Dryad for deposits and that's per
  • 43:11project and you can have unlimited projects.
  • 43:14So there is quite a bit of room there.
  • 43:17They take just about any data type.
  • 43:19The only real caveat to Dryad is
  • 43:22that it has to be public domain data.
  • 43:25So you basically have to
  • 43:27release it completely open.
  • 43:28So if you have really sensitive
  • 43:30data that may not work,
  • 43:32but otherwise they are good
  • 43:34for a lot of data types.
  • 43:36This other one is ICPSR which
  • 43:38we are also members of.
  • 43:39This one is more for like public
  • 43:41health data and kind of social justice
  • 43:44oriented health data and they're
  • 43:46originally a social sciences repository,
  • 43:48but they have multiple NIH data collections,
  • 43:51so it's possible that you might
  • 43:53find a fit there.
  • 43:54And then a few more that I want to mention,
  • 43:55I already said Vivli is one that I would
  • 43:58definitely encourage you to check out.
  • 44:00If you're on the clinical front,
  • 44:02this one in the top right corner is
  • 44:04how you get to basically all of the
  • 44:07National Center for Biotechnology
  • 44:09Information repositories,
  • 44:10including like SRA, GEO,
  • 44:13lots of omics resources.
  • 44:15So that's where you can find
  • 44:16all of those in one place.
  • 44:17And in the bottom left corner
  • 44:20are all the repositories that NIH has.
  • 44:23It's a very long list.
  • 44:24They have hundreds. And then OSF
  • 44:26is another one that I like to plug:
  • 44:29a home for lots of different data
  • 44:31types, and also a good one to know
  • 44:33about if you're going to be required
  • 44:35to do preregistration for your
  • 44:37study; they handle that as well.
  • 44:40Okay.
  • 44:40So just a few final things that
  • 44:42I want to say and then I'm going
  • 44:44to open it up again for questions.
  • 44:46This is something that the
  • 44:47NIH has on their website
  • 44:48right now, which is that they really
  • 44:50want investigators to benefit from
  • 44:52first and continuing use of their data,
  • 44:55but not from prolonged exclusive use.
  • 44:58So this is really something to think
  • 44:59about as you're thinking about,
  • 45:01well, how do I share?
  • 45:02What do I share and when do I share it?
  • 45:05The truth is that most researchers
  • 45:07want to hold on to their data for
  • 45:09a while because often you can get
  • 45:10several, you know, continuing
  • 45:12uses out of it, and the NIH
  • 45:14supports that. They want you to be
  • 45:16able to get that continuing use.
  • 45:18What they don't want you to do is to hold
  • 45:20on to it for, you know, a really
  • 45:22long time and never let anyone else
  • 45:24get access to it.
  • 45:26So finding the balance between that
  • 45:28is something that I think the NIH
  • 45:29could do a better job of describing
  • 45:31and being much more clear on.
  • 45:33But that is what they're saying
  • 45:34is their intent.
  • 45:35So I think that you're well within
  • 45:37your rights to hold them to that,
  • 45:38both in your plan and how you go
  • 45:41through your entire research project.
  • 45:44And then I also just want to say this
  • 45:46is a bit of an ambitious goal of my own,
  • 45:49but something that I know that the
  • 45:51NIH was hoping for, and that lots of
  • 45:53other people who contributed to this
  • 45:54policy and kind of heard about this
  • 45:56policy and have encouraged similar
  • 45:58policies want to see is that the real
  • 46:01goal here is that hopefully eventually
  • 46:03we will start to see data sets
  • 46:05as a primary research output,
  • 46:06just as important, just as valuable,
  • 46:09just as prioritized,
  • 46:10just as held up for promotion
  • 46:13and tenure as publications are.
  • 46:15Because data sets are a ton of work.
  • 46:18I mean,
  • 46:18whether it's the generation of
  • 46:20actually creating and collecting them,
  • 46:21whether it's the analysis that goes into it,
  • 46:23whether it's the cleaning and
  • 46:26de-identification that's required
  • 46:27to actually make them usable and
  • 46:29depositable in a new location.
  • 46:31They are a lot of work,
  • 46:33a lot of time, a lot of labor.
  • 46:36So they deserve to be,
  • 46:37you know,
  • 46:37treated just the same as your publications
  • 46:40are, held up very much in the same way.
  • 46:43And that's something that I hope we're
  • 46:44going to see more of in the future,
  • 46:46that they're very much seen as just as
  • 46:48worthy of a product as all the publications
  • 46:51that you hope to get on your CV.
  • 46:53So again,
  • 46:53bit of a rosy outlook there.
  • 46:55I don't know how long it will take
  • 46:56for us to get to that point, or
  • 46:58if we'll ever get to that point,
  • 46:59but that is one of the goals.
  • 47:02So, let's see,
  • 47:03just to sum up a little bit,
  • 47:05data management and sharing
  • 47:07expectations are increasing not just
  • 47:08at the NIH but at lots of other U.S.
  • 47:11Federal agencies and at lots of
  • 47:12other places around the globe.
  • 47:14So I really don't think this is going away.
  • 47:16If anything this is kind of turning up.
  • 47:19And something to be really
  • 47:21thinking about in your future as a
  • 47:23researcher and as a career academic.
  • 47:27And finally, I hope the main thing that
  • 47:29you take away from this is that there are
  • 47:31a lot of resources available out there.
  • 47:33That's actually what I'm going
  • 47:35to show you next.
  • 47:36I've got all the references here from
  • 47:38things that were mentioned in the slides.
  • 47:41There are quite a lot of resources
  • 47:42at the library.
  • 47:43We have a whole site with lots of
  • 47:45resources that were discussed today,
  • 47:47but many more.
  • 47:48There's an asynchronous e-mail course
  • 47:49that you can take on data management
  • 47:51and how to write data management plans.
  • 47:54It takes about two to three weeks to take.
  • 47:55It's completely asynchronous; it's just
  • 47:57emails that get delivered to you.
  • 47:59So if you're interested in that,
  • 48:00sign up. And then I offer lots of trainings
  • 48:04on this and I am available if you
  • 48:06want help with your data management plans.
  • 48:08Once you're kind of towards the final stages,
  • 48:10always happy to do a quick review for you
  • 48:13and give you text on how to potentially
  • 48:15make it better and answer your questions.
  • 48:18Also, these are some more resources
  • 48:20to help you write your plans,
  • 48:22some of which I already covered today.
  • 48:24Here's resources from the Office of
  • 48:27Sponsored Projects about budgeting,
  • 48:29and then finally my e-mail address
  • 48:31for any questions that I can answer.
  • 48:33The Office of Sponsored Projects is
  • 48:34there for you to help as well.
  • 48:36They're definitely the ones
  • 48:37you want to go to.
  • 48:38If you have specific questions
  • 48:39about budgeting or some of
  • 48:40the more kind of like policy,
  • 48:42upload questions,
  • 48:43and they're the ones who
  • 48:44will help you with that.
  • 48:46So that is all I have for you all.
  • 48:48I'm going to stop it right here
  • 48:50on my e-mail address and I will
  • 48:52open the floor for any additional
  • 48:55questions that you all have.
  • 48:56And like I said,
  • 48:57I will make sure these slides get to you.
  • 48:59I'm not sure who I need to send them to,
  • 49:00but I'll make sure that
  • 49:02they're available for everyone.
  • 49:17Yeah, Question. Go ahead.
  • 49:19I was wondering, do you know
  • 49:22if they're actually going to
  • 49:24be checking in on whether we
  • 49:26follow up with these plans?
  • 49:28And if so, how in the world
  • 49:30would that possibly happen?
  • 49:33I can't imagine them, you know,
  • 49:35that would take so much effort and
  • 49:37it would be no fun for anybody,
  • 49:38I would imagine. Yeah. So they
  • 49:44have, you know,
  • 49:45suggested that they might in the
  • 49:47various webinars and trainings that
  • 49:48they've had about this,
  • 49:51but I think your intuition is spot on.
  • 49:53I mean, how how could they?
  • 49:56I don't think the
  • 49:58NIH is well enough resourced to really
  • 50:01be doing lots of checkups about this.
  • 50:03I mean in a lot of ways you're very
  • 50:05much on the honor system here.
  • 50:07Although I will say that you know
  • 50:09data repositories are definitely
  • 50:10going to check up if you submit to
  • 50:12them. Most of them have curators
  • 50:14on staff who are going to, you
  • 50:15know, look through your data, trying
  • 50:17to make sure that it's actually in
  • 50:18a pretty good state for publishing.
  • 50:20Much in the same way that when
  • 50:21you publish, you know, a manuscript,
  • 50:23you're usually
  • 50:24going to have peer review and
  • 50:25editors looking at things.
  • 50:26So data repositories are really
  • 50:27similar in that sense.
  • 50:29But is the NIH going to be, you know,
  • 50:32coming around doing audits on
  • 50:34that kind of stuff?
  • 50:35I'm just not so sure.
  • 50:36But I will say this is a good
  • 50:38opportunity to show you the various
  • 50:40resources the NIH has about this.
  • 50:42These are in the resources
  • 50:43that I just showed you.
  • 50:44But this is sharing.nih.gov. It's a
  • 50:47new website that they have produced
  • 50:48to support the policy, and they
  • 50:51do actually have a section
  • 50:53about this,
  • 50:54so I'm going to head over here to writing
  • 50:56that data management and sharing plan.
  • 50:59And down here they have assessment
  • 51:02and this is where they basically
  • 51:04talk about how it's going to be
  • 51:06reviewed and they also talk about
  • 51:08that they are going to expect you to
  • 51:10update on this during your RPPRs.
  • 51:12I think I said that acronym right.
  • 51:14I hope so. Those are the two
  • 51:16things that they've said about
  • 51:18it: that they will be
  • 51:19reviewing the plan during the
  • 51:22pre-award process and that they will
  • 51:23also be expecting you to kind of talk
  • 51:25about this in your updates to the NIH.
  • 51:27But other than that no particular
  • 51:30enforcement has yet been announced.
  • 51:33So we'll see maybe that's coming,
  • 51:35maybe it's not,
  • 51:37but for now it's pretty light touch.
  • 51:42Other questions?
  • 51:45I will just go back and say
  • 51:47this is a really nice resource.
  • 51:49Some of it is dense,
  • 51:52but a lot of questions can definitely
  • 51:54be answered on sharing.nih.gov. But
  • 51:57you're always welcome to e-mail me
  • 51:59and I'm happy to kind of point you
  • 52:01directly to the resource that you need.
  • 52:02I feel like I know this website by
  • 52:04the back of my hand at this point.
  • 52:21Question.
  • 52:28Actually, I have another question. It's
  • 52:32not directly related to this
  • 52:34particular document, but on the R35
  • 52:36that I just recently submitted,
  • 52:39I had to, I was told at the very
  • 52:41last minute I had to do this thing.
  • 52:43It's called a plan for
  • 52:46enhancing diverse perspectives.
  • 52:47Are you aware of this?
  • 52:49And unlike the data management plan,
  • 52:51I was unable to find any templates.
  • 52:54Or any sort of real sense of
  • 52:56what they were actually after.
  • 52:58And I basically had to write
  • 53:00something that I have no idea if it
  • 53:03was along the right lines or not.
  • 53:07You said it was called the plan
  • 53:09for enhancing diverse
  • 53:11perspectives. So it's essentially a DEI plan.
  • 53:16I am not familiar with that,
  • 53:18but I'll do some looking around to see if
  • 53:21that's something that maybe I should
  • 53:22be talking about in these
  • 53:24presentations. I also wonder if you'd
  • 53:25be able to get more information
  • 53:27about that, perhaps from Y FM,
  • 53:30the diversity office. Or the library
  • 53:33also has a diversity office,
  • 53:35but I don't know about that offhand.
  • 53:37I am sorry to say.
  • 53:40I didn't either until the
  • 53:41day I submitted my grant,
  • 53:42and then I realized I found out
  • 53:44I had to write this in one day.
  • 53:46What an unwelcome surprise.
  • 53:48That is so frustrating.
  • 53:50That's why I thought I'd mention
  • 53:51it now, just in case other people
  • 53:53might run into the same thing.
  • 53:56Well, I appreciate you bringing that up.
  • 53:57I hope that's a good potential, kind of,
  • 54:00you know, people will know that maybe
  • 54:02they need to look around for that,
  • 54:03and I'll definitely keep my eye
  • 54:04out for resources on that as well.
  • 54:15All right. If there's no other questions,
  • 54:18thank you very much, Caitlin.
  • 54:19We will make the recording and the
  • 54:21slides available up on our website.