WEBVTT

00:00:00.000 --> 00:00:03.255
Now I would like to introduce our great speakers.

00:00:03.255 --> 00:00:08.745
Philip Young is the repository manager at VTechWorks,

00:00:08.745 --> 00:00:10.320
which provides global access

00:00:10.320 --> 00:00:11.955
to Virginia Tech scholarship.

00:00:11.955 --> 00:00:14.430
He also provides outreach to the university on

00:00:14.430 --> 00:00:17.565
open access, ORCID and Perma.cc.

00:00:17.565 --> 00:00:19.920
Jimmy Ghaphery is Associate Dean for

00:00:19.920 --> 00:00:22.110
Scholarly Communications and Publishing at

00:00:22.110 --> 00:00:24.510
Virginia Commonwealth University Libraries.

00:00:24.510 --> 00:00:26.175
He and his team coordinate

00:00:26.175 --> 00:00:27.990
library technology operations and

00:00:27.990 --> 00:00:29.745
advocacy for all things open,

00:00:29.745 --> 00:00:32.685
including open access, open repositories,

00:00:32.685 --> 00:00:34.620
open publishing, open educational

00:00:34.620 --> 00:00:36.640
resources, and open data.

00:00:36.640 --> 00:00:39.260
Without further ado, this is their presentation,

00:00:39.260 --> 00:00:41.330
public domain and paywall copyright

00:00:41.330 --> 00:00:42.800
statements on journal articles

00:00:42.800 --> 00:00:45.095
by US government employees.

00:00:45.095 --> 00:00:48.820
Well, thank you for the introduction, Ellyn.

00:00:48.820 --> 00:00:51.920
I'm Jimmy Ghaphery and Philip and I are

00:00:51.920 --> 00:00:55.010
thrilled to be here at the Kraemer Copyright Conference.

00:00:55.010 --> 00:00:58.190
The sessions have been fantastic so far,

00:00:58.190 --> 00:01:00.290
so much of what I've been able to

00:01:00.290 --> 00:01:04.320
attend really builds on what

00:01:04.320 --> 00:01:06.330
we want to talk about from

00:01:06.330 --> 00:01:10.010
Sandra and Kyle's session that I was able to

00:01:10.010 --> 00:01:12.740
go to talking about government works, to

00:01:12.740 --> 00:01:16.040
Pia's excellent keynote about author's rights.

00:01:16.040 --> 00:01:20.990
I just sat in on Kevin Smith and Kenneth Crews' session,

00:01:20.990 --> 00:01:22.550
which really goes to the heart of

00:01:22.550 --> 00:01:24.110
a lot of what we're talking about.

00:01:24.110 --> 00:01:28.395
Anyway, happy to be here and next slide.

00:01:28.395 --> 00:01:31.280
What we hope to do today is to give

00:01:31.280 --> 00:01:33.020
some brief background on

00:01:33.020 --> 00:01:35.990
the copyright status of US government authored works.

00:01:35.990 --> 00:01:38.750
I'll share some preliminary research results

00:01:38.750 --> 00:01:40.190
of a study we've been doing

00:01:40.190 --> 00:01:41.450
about how these works are

00:01:41.450 --> 00:01:45.140
labeled by commercial journal publishers.

00:01:45.140 --> 00:01:47.120
Talk about the implications of

00:01:47.120 --> 00:01:49.475
this research for libraries and access.

00:01:49.475 --> 00:01:53.240
Most importantly, we're thrilled to be talking to a group

00:01:53.240 --> 00:01:55.760
of librarians and copyright experts

00:01:55.760 --> 00:01:57.965
and we look forward to getting your feedback.

00:01:57.965 --> 00:02:02.230
For background and this is as much to let you all

00:02:02.230 --> 00:02:06.070
know that Philip and I are not copyright lawyers,

00:02:06.070 --> 00:02:09.550
but just to share some of our understanding of

00:02:09.550 --> 00:02:11.560
copyright law is that

00:02:11.560 --> 00:02:13.420
copyright protection is not

00:02:13.420 --> 00:02:15.535
available for any work of the US government.

00:02:15.535 --> 00:02:19.565
This is from 17 USC 105.

00:02:19.565 --> 00:02:26.470
A US government work is defined in 17 USC 101

00:02:26.470 --> 00:02:29.860
as a work that's prepared by an officer or employee

00:02:29.860 --> 00:02:31.375
of the United States government

00:02:31.375 --> 00:02:33.505
as part of their official duties.

00:02:33.505 --> 00:02:36.925
It's interesting looking at the legislative history,

00:02:36.925 --> 00:02:40.270
that this doesn't prevent

00:02:40.270 --> 00:02:41.780
a government employee from securing

00:02:41.780 --> 00:02:44.580
copyright even in their field of expertise,

00:02:44.580 --> 00:02:48.845
if this isn't part of their official duties.

00:02:48.845 --> 00:02:51.050
Back up one more slide.

00:02:51.050 --> 00:02:58.380
The last piece that we want to mention is 17 USC 403,

00:02:58.690 --> 00:03:03.790
where there is no requirement for any copyright labeling.

00:03:03.790 --> 00:03:05.540
It's interesting to note that

00:03:05.540 --> 00:03:09.850
this only dates back to 1989.

00:03:09.850 --> 00:03:12.330
Before then it was a requirement to have

00:03:12.330 --> 00:03:13.735
the copyright label and as

00:03:13.735 --> 00:03:16.505
Kenneth Crews said, it was very draconian.

00:03:16.505 --> 00:03:19.920
He just said that in the past presentation.

00:03:19.920 --> 00:03:25.160
This included the requirement to state whether part

00:03:25.160 --> 00:03:30.570
of the work was the work of US government employees,

00:03:30.570 --> 00:03:32.210
and if that was mislabeled,

00:03:32.210 --> 00:03:35.795
that would throw the copyright completely out the window.

00:03:35.795 --> 00:03:39.485
The other interesting thing about US government works,

00:03:39.485 --> 00:03:40.910
and you'll often see this on

00:03:40.910 --> 00:03:45.390
the copyright labeling, is foreign protection.

00:03:45.390 --> 00:03:48.860
Even though these works are in the public domain

00:03:48.860 --> 00:03:50.569
in the United States,

00:03:50.569 --> 00:03:53.299
the United States government is reserving

00:03:53.299 --> 00:03:56.945
the right to assert copyright overseas.

00:03:56.945 --> 00:04:00.290
This is seen when you look at

00:04:00.290 --> 00:04:05.105
the legislative history from the US House report,

00:04:05.105 --> 00:04:07.940
that there are no valid reasons for

00:04:07.940 --> 00:04:11.615
denying such protection to the United States government.

00:04:11.615 --> 00:04:14.240
It's interesting to think of this as

00:04:14.240 --> 00:04:17.360
merely theoretical whether the US government

00:04:17.360 --> 00:04:21.290
would assert the copyright

00:04:21.290 --> 00:04:23.440
of these works overseas or not.

00:04:23.440 --> 00:04:26.060
But we had a great email conversation

00:04:26.060 --> 00:04:27.290
with Peter Hirtle about

00:04:27.290 --> 00:04:31.160
this question and he helped us in our thinking on it.

00:04:31.160 --> 00:04:33.650
He noted that, as an exercise,

00:04:33.650 --> 00:04:37.445
he looked at the copyright system in Canada,

00:04:37.445 --> 00:04:40.520
in which he found that the US government has actually

00:04:40.520 --> 00:04:42.290
registered a few copyrights for

00:04:42.290 --> 00:04:44.615
CDC publications in Canada.

00:04:44.615 --> 00:04:48.845
One of the best resources we found about

00:04:48.845 --> 00:04:53.410
US government works and copyright is the CENDI FAQ.

00:04:53.410 --> 00:04:57.065
CENDI is an interagency organization.

00:04:57.065 --> 00:04:59.750
It's just a really nice overview.

00:04:59.750 --> 00:05:01.850
It explains the various exceptions,

00:05:01.850 --> 00:05:05.210
including specific agencies that are exempt from this,

00:05:05.210 --> 00:05:09.845
as well as how this all fits in with contractors.

00:05:09.845 --> 00:05:11.825
The one thing that was

00:05:11.825 --> 00:05:13.670
the most reassuring to Philip and me,

00:05:13.670 --> 00:05:16.595
we had lots of questions about co-authorship.

00:05:16.595 --> 00:05:18.910
What happens if you're in

00:05:18.910 --> 00:05:20.900
a co-authorship situation where

00:05:20.900 --> 00:05:23.330
one author is from the US government and one

00:05:23.330 --> 00:05:26.305
is from private industry or another university?

00:05:26.305 --> 00:05:29.510
This remains confusing to us to

00:05:29.510 --> 00:05:33.245
some degree as does I think co-authorship in general.

00:05:33.245 --> 00:05:34.520
It was reassuring that

00:05:34.520 --> 00:05:38.510
the CENDI site notes that much of this is

00:05:38.510 --> 00:05:42.740
unsettled in courts and

00:05:42.740 --> 00:05:45.515
in such situations you should talk to a real lawyer.

00:05:45.515 --> 00:05:49.460
The last thing I'll note on the CENDI FAQ that's

00:05:49.460 --> 00:05:51.710
posted is their

00:05:51.710 --> 00:05:53.720
actual copyright notice that they include,

00:05:53.720 --> 00:05:55.850
which is similar to some of the notices we've

00:05:55.850 --> 00:05:59.135
seen where they say this is a work of the US government.

00:05:59.135 --> 00:06:01.220
It's not subject to copyright protection

00:06:01.220 --> 00:06:02.615
in the United States,

00:06:02.615 --> 00:06:04.610
foreign copyrights may apply.

00:06:04.610 --> 00:06:07.550
It was interesting to us that even on this FAQ,

00:06:07.550 --> 00:06:09.440
they asserted that possibility

00:06:09.440 --> 00:06:12.650
for foreign copyright. Philip.

00:06:12.650 --> 00:06:15.155
Thanks Jimmy.

00:06:15.155 --> 00:06:19.160
I want to talk a little bit about our research prompts.

00:06:19.160 --> 00:06:22.100
As a repository manager,

00:06:22.100 --> 00:06:23.240
part of my job is getting

00:06:23.240 --> 00:06:25.190
content into the repository.

00:06:25.190 --> 00:06:27.260
When I see articles that are

00:06:27.260 --> 00:06:28.910
marked public domain in the United States,

00:06:28.910 --> 00:06:30.560
I feel relatively comfortable that they can go

00:06:30.560 --> 00:06:32.615
into the institutional repository,

00:06:32.615 --> 00:06:34.760
and have the articles of

00:06:34.760 --> 00:06:37.280
Virginia Tech authors in there and that's great.

00:06:37.280 --> 00:06:40.490
Then I noticed that there's lots of other articles by

00:06:40.490 --> 00:06:42.110
government authors that aren't

00:06:42.110 --> 00:06:44.585
marked public domain in the United States.

00:06:44.585 --> 00:06:47.445
Is it still okay to put them in the IR?

00:06:47.445 --> 00:06:49.400
Or are we depending on

00:06:49.400 --> 00:06:52.085
publishers to correctly mark these?

00:06:52.085 --> 00:06:54.530
What's the risk of putting these articles

00:06:54.530 --> 00:06:56.950
in the institutional repository?

00:06:56.950 --> 00:06:58.920
Am I going to hear from publishers?

00:06:58.920 --> 00:07:01.545
Am I going to get a takedown request?

00:07:01.545 --> 00:07:03.840
In 2019 I posted to

00:07:03.840 --> 00:07:07.455
the IR Managers Google group about this issue.

00:07:07.455 --> 00:07:11.200
The general response was confusion.

00:07:11.200 --> 00:07:13.745
I remember two of the responses.

00:07:13.745 --> 00:07:16.820
One of them was from another manager who said,

00:07:16.820 --> 00:07:18.320
"Yes, we are doing this and we've

00:07:18.320 --> 00:07:21.089
been doing it for quite a while."

00:07:21.860 --> 00:07:26.270
The other response was from a manager of

00:07:26.270 --> 00:07:29.690
a repository at a government agency, he said, "Yes,

00:07:29.690 --> 00:07:31.190
I'm very interested in knowing

00:07:31.190 --> 00:07:33.080
the answer to this because I have

00:07:33.080 --> 00:07:34.550
lots of authors writing

00:07:34.550 --> 00:07:37.620
articles and I'd like to put them into my repository."

00:07:38.510 --> 00:07:41.915
We decided to collect some data.

00:07:41.915 --> 00:07:46.565
How are publishers currently labeling these articles?

00:07:46.565 --> 00:07:48.980
When they label these articles

00:07:48.980 --> 00:07:50.450
or if they label these articles,

00:07:50.450 --> 00:07:53.030
is access provided when they're

00:07:53.030 --> 00:07:57.280
public domain in the United States?

00:07:58.010 --> 00:08:01.200
How do publishers differ on labeling,

00:08:01.200 --> 00:08:02.950
is one publisher always

00:08:02.950 --> 00:08:05.420
labeling and another publisher never labeling?

00:08:05.420 --> 00:08:08.550
But what if all of the authors are government employees?

00:08:08.550 --> 00:08:13.290
That would throw out the joint works confusion there.

00:08:13.290 --> 00:08:15.850
We also wanted to look at trends

00:08:15.850 --> 00:08:18.505
in the type of copyright assertion,

00:08:18.505 --> 00:08:20.560
as well as disciplinary differences,

00:08:20.560 --> 00:08:22.600
which we'll see in just a moment.

00:08:22.600 --> 00:08:24.970
We're not the only ones who have looked at this.

00:08:24.970 --> 00:08:26.740
Others have looked at the issue of

00:08:26.740 --> 00:08:29.065
copyright marking practices before.

00:08:29.065 --> 00:08:32.140
For example, Paul Royster, in commenting on NISO's

00:08:32.140 --> 00:08:35.845
proposed open access metadata standards in 2014,

00:08:35.845 --> 00:08:37.780
provided eight examples of

00:08:37.780 --> 00:08:40.725
incorrect copyright assignment in scholarly journals,

00:08:40.725 --> 00:08:44.045
all of which dealt with US government works.

00:08:44.045 --> 00:08:46.835
April Hathcock in 2016,

00:08:46.835 --> 00:08:49.055
noted conflicting copyright statements

00:08:49.055 --> 00:08:50.690
on the same work as

00:08:50.690 --> 00:08:51.920
part of her critique of

00:08:51.920 --> 00:08:55.529
the ethical dimensions of copyright and reuse.

00:08:56.000 --> 00:08:58.970
We collected two datasets,

00:08:58.970 --> 00:09:01.595
one from the National Cancer Institute

00:09:01.595 --> 00:09:04.790
and the other from the US Department of Agriculture.

00:09:04.790 --> 00:09:08.870
Both datasets are from the calendar year 2019.

00:09:08.870 --> 00:09:12.549
We derived these from the Web of Science.

00:09:12.549 --> 00:09:15.765
These articles do include coauthors.

00:09:15.765 --> 00:09:19.310
This search retrieves any article that

00:09:19.310 --> 00:09:23.620
has a government author on it from those two agencies.

00:09:23.620 --> 00:09:27.589
You can see our search strategy at the bottom.

00:09:27.589 --> 00:09:32.735
We used the affiliation data for the two agencies.

00:09:32.735 --> 00:09:35.810
Then we filtered by

00:09:35.810 --> 00:09:38.090
the article type and then

00:09:38.090 --> 00:09:41.180
refined by the publication year 2019.

00:09:41.180 --> 00:09:43.415
For the National Cancer Institute,

00:09:43.415 --> 00:09:46.790
this resulted in 2,378 articles issued in

00:09:46.790 --> 00:09:51.600
2019 and for USDA, over 6,000 articles.

00:09:52.980 --> 00:09:56.800
These are large datasets and we wanted to do

00:09:56.800 --> 00:10:00.580
a random sample to look at these manually.

00:10:00.580 --> 00:10:03.310
To get a 95 percent confidence level

00:10:03.310 --> 00:10:05.440
and five percent margin of error,

00:10:05.440 --> 00:10:12.040
we drew random samples of 318 articles for

00:10:12.040 --> 00:10:16.930
the NCI dataset and 362 articles for

00:10:16.930 --> 00:10:22.000
the USDA dataset and we each looked at these manually,

00:10:22.000 --> 00:10:26.890
Jimmy taking the NCI dataset and I took the USDA dataset.

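NOTE
A minimal sketch of the sample-size arithmetic behind a 95 percent confidence level and five percent margin of error, using Cochran's formula with a finite population correction. The talk does not say which calculator was used, so the formula, the function name sample_size, and the USDA population of 6,145 (8,523 total minus 2,378 NCI articles) are assumptions. Under them the USDA result is 362, as reported; for 2,378 NCI articles the same formula gives 331 rather than the reported 318, so the NCI sample was presumably computed with different inputs.
```python
import math
def sample_size(population, z=1.96, margin=0.05, p=0.5):
    """Cochran's sample size with finite population correction (assumed method)."""
    # Sample size for an infinite population; p=0.5 is the most conservative choice.
    n0 = (z ** 2) * p * (1 - p) / (margin ** 2)
    # Shrink n0 for a finite population and round up to whole articles.
    return math.ceil(n0 / (1 + (n0 - 1) / population))
print(sample_size(6145))  # assumed USDA population
print(sample_size(2378))  # NCI population stated in the talk
```
Changing z or margin shows how quickly the required manual review grows with a tighter margin of error.
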
00:10:26.890 --> 00:10:31.150
We decided to look for the copyright statement on the PDF

00:10:31.150 --> 00:10:33.865
and so we were always looking at the PDF

00:10:33.865 --> 00:10:36.610
for whatever copyright statement was there.

00:10:36.610 --> 00:10:39.130
That's frequently referred to as the version of

00:10:39.130 --> 00:10:42.010
record and of course it's portable and

00:10:42.010 --> 00:10:44.740
gets passed around from person to person and so it's

00:10:44.740 --> 00:10:46.510
important to have a good copyright statement

00:10:46.510 --> 00:10:48.505
on that document.

00:10:48.505 --> 00:10:53.230
Of course, we are both fortunate to be at

00:10:53.230 --> 00:10:55.960
major research institutions where we have

00:10:55.960 --> 00:11:00.130
access to Web of Science as well as the PDFs.

00:11:00.130 --> 00:11:01.915
One thing I should mention is

00:11:01.915 --> 00:11:03.610
that we would have preferred to use

00:11:03.610 --> 00:11:05.590
a non-proprietary source for

00:11:05.590 --> 00:11:08.515
the bibliographic information such as Lens.

00:11:08.515 --> 00:11:10.720
I did do this search for

00:11:10.720 --> 00:11:13.360
USDA on the Lens site and only retrieved

00:11:13.360 --> 00:11:15.385
about a third of the articles

00:11:15.385 --> 00:11:18.460
and so we decided to go with Web of Science instead.

00:11:18.460 --> 00:11:20.470
Both Jimmy and I always

00:11:20.470 --> 00:11:22.780
found that each of the articles did

00:11:22.780 --> 00:11:24.580
have a government author on there so we feel

00:11:24.580 --> 00:11:27.770
like the accuracy was quite good.

00:11:28.470 --> 00:11:31.030
When we manually looked at these articles,

00:11:31.030 --> 00:11:33.820
we put them into one of five designations.

00:11:33.820 --> 00:11:36.280
The first was copyright statement missing,

00:11:36.280 --> 00:11:37.600
we could not find a copyright

00:11:37.600 --> 00:11:39.475
statement anywhere on the PDF.

00:11:39.475 --> 00:11:41.875
The second was a generic statement,

00:11:41.875 --> 00:11:44.199
something along the lines of copyright,

00:11:44.199 --> 00:11:47.245
Wiley 2019 something like that.

00:11:47.245 --> 00:11:49.870
Then we collapsed all the Creative Commons

00:11:49.870 --> 00:11:52.945
licenses into one Creative Commons category.

00:11:52.945 --> 00:11:57.280
Then of course, we want to look at the ones that

00:11:57.280 --> 00:12:01.705
did recognize Public Domain in the US on the article.

00:12:01.705 --> 00:12:04.060
Finally, we had a category

00:12:04.060 --> 00:12:08.350
for CC0, public domain dedication.

00:12:08.350 --> 00:12:10.780
In a few cases there was

00:12:10.780 --> 00:12:13.945
more than one copyright statement on the article.

00:12:13.945 --> 00:12:16.840
For example, there were a few articles that said

00:12:16.840 --> 00:12:20.460
public domain in the United States and CC-BY.

00:12:20.460 --> 00:12:23.340
When that happened, we always put the article in

00:12:23.340 --> 00:12:28.670
the public domain category. Jimmy.

00:12:29.160 --> 00:12:35.800
This is what we found. Looking across these samples,

00:12:35.800 --> 00:12:39.985
with the NCI and the USDA combined, roughly

00:12:39.985 --> 00:12:43.450
13 percent actually recognize

00:12:43.450 --> 00:12:45.980
the US government authorship.

00:12:46.410 --> 00:12:50.785
More interestingly, across both sets,

00:12:50.785 --> 00:12:52.180
I don't know if it's more interesting,

00:12:52.180 --> 00:12:57.235
but 28 percent use a Creative Commons license.

00:12:57.235 --> 00:13:00.715
Almost half of the articles in the sample

00:13:00.715 --> 00:13:04.375
had a generic copyright statement and that would be

00:13:04.375 --> 00:13:07.300
no recognition at all that there was

00:13:07.300 --> 00:13:12.130
a US government author and very

00:13:12.130 --> 00:13:17.785
few with a CC0 Public Domain dedication.

00:13:17.785 --> 00:13:20.920
We're going to talk later

00:13:20.920 --> 00:13:25.120
about the ones we found that had no copyright statement.

00:13:25.120 --> 00:13:27.820
The other thing that I'll just mention

00:13:27.820 --> 00:13:29.650
is that it was very

00:13:29.650 --> 00:13:34.719
interesting to us that with disciplinary differences,

00:13:34.719 --> 00:13:39.010
that these sets were pretty similar in terms of

00:13:39.010 --> 00:13:41.530
copyright labeling practices for government

00:13:41.530 --> 00:13:45.280
authored works and we found that somewhat surprising.

00:13:45.280 --> 00:13:49.270
Some examples of copyright assertions

00:13:49.270 --> 00:13:53.050
where there were clearly governmental authors.

00:13:53.050 --> 00:13:57.040
Here's the first one where the lead author

00:13:57.040 --> 00:13:59.050
was from the National Cancer Institute and

00:13:59.050 --> 00:14:01.780
four of the six authors were from the NCI.

00:14:01.780 --> 00:14:03.805
That simply says copyright,

00:14:03.805 --> 00:14:05.650
American Society for Clinical

00:14:05.650 --> 00:14:08.420
Oncology, all rights reserved.

00:14:08.600 --> 00:14:12.795
Likewise, the second example,

00:14:12.795 --> 00:14:15.975
only two co-authors from the NCI,

00:14:15.975 --> 00:14:20.190
no recognition of government authorship and a pretty

00:14:20.190 --> 00:14:23.215
severe copyright statement that

00:14:23.215 --> 00:14:26.035
the society reserves all rights.

00:14:26.035 --> 00:14:30.220
The third one, an example where NCI was

00:14:30.220 --> 00:14:33.820
the lead author and four of seven authors

00:14:33.820 --> 00:14:37.765
were from NCI and Springer Nature simply says yes,

00:14:37.765 --> 00:14:39.700
we got the copyright here.

00:14:39.700 --> 00:14:43.270
Likewise, we saw the same type of thing for

00:14:43.270 --> 00:14:47.689
USDA and not much disciplinary difference

00:14:49.080 --> 00:14:54.940
whether it's all USDA co-authors or clearly

00:14:54.940 --> 00:14:59.380
some significant contribution from the USDA and this was

00:14:59.380 --> 00:15:04.480
across all of societies and publishers.

00:15:04.480 --> 00:15:06.760
The examples of government authorship

00:15:06.760 --> 00:15:09.130
were also really interesting.

00:15:09.130 --> 00:15:11.740
There are some similarities here,

00:15:11.740 --> 00:15:14.650
but, as bolded, it's interesting what

00:15:14.650 --> 00:15:19.550
the different publishers are considering to be copyrighted.

00:15:19.680 --> 00:15:22.435
Is it the work and its text?

00:15:22.435 --> 00:15:24.565
Is it just the article?

00:15:24.565 --> 00:15:29.815
There's also the copyright symbol on some of these.

00:15:29.815 --> 00:15:32.830
There was even variety,

00:15:32.830 --> 00:15:34.060
as you can see in the middle of

00:15:34.060 --> 00:15:36.700
this chart, within Wiley.

00:15:36.700 --> 00:15:41.725
A very interesting example from Taylor and Francis,

00:15:41.725 --> 00:15:47.110
where they have the same boilerplate information at

00:15:47.110 --> 00:15:49.510
the top of the copyright statement that the work was

00:15:49.510 --> 00:15:53.065
authored as part of the contributors' official duties, etc.

00:15:53.065 --> 00:15:59.560
But then below, different open access addendums,

00:15:59.560 --> 00:16:02.515
I would call them probably incorrectly.

00:16:02.515 --> 00:16:05.080
But one that has

00:16:05.080 --> 00:16:07.870
OA Creative Commons public domain mark

00:16:07.870 --> 00:16:11.305
and the other that lists Creative Commons,

00:16:11.305 --> 00:16:13.900
non-commercial no derivatives license.

00:16:13.900 --> 00:16:15.580
This is interesting in a couple of

00:16:15.580 --> 00:16:20.080
ways since the article is in the public domain in the US,

00:16:20.080 --> 00:16:22.360
how that could conflict with

00:16:22.360 --> 00:16:25.270
a non-commercial no derivatives license, and

00:16:25.270 --> 00:16:29.110
also since these Creative Commons licenses

00:16:29.110 --> 00:16:30.729
are international licenses,

00:16:30.729 --> 00:16:32.500
how that conflicts with

00:16:32.500 --> 00:16:34.180
the US government reserving

00:16:34.180 --> 00:16:36.160
the right to assert copyright.

00:16:36.160 --> 00:16:39.820
Were the NCI authors in a position to even

00:16:39.820 --> 00:16:45.320
agree to an international Creative Commons license?

00:16:45.900 --> 00:16:48.745
We also wanted to look at

00:16:48.745 --> 00:16:50.905
the Elsevier articles, which

00:16:50.905 --> 00:16:53.185
were the largest group from any publisher.

00:16:53.185 --> 00:16:56.785
As you can see on the graph on the far left,

00:16:56.785 --> 00:17:00.939
almost a quarter of the sample was published by Elsevier.

00:17:00.939 --> 00:17:02.710
But there are a couple of other things

00:17:02.710 --> 00:17:03.730
that caught our eye that

00:17:03.730 --> 00:17:06.160
I've circled on this graph

00:17:06.160 --> 00:17:08.920
and the first one is

00:17:08.920 --> 00:17:11.050
that of the articles

00:17:11.050 --> 00:17:13.450
that were missing a copyright statement,

00:17:13.450 --> 00:17:18.054
a large number of them were from Elsevier,

00:17:18.054 --> 00:17:20.455
in fact 75 percent.

00:17:20.455 --> 00:17:22.975
We found it unusual that

00:17:22.975 --> 00:17:25.915
the largest publisher of academic journals in the world

00:17:25.915 --> 00:17:27.160
was simply not putting

00:17:27.160 --> 00:17:31.675
any copyright statement at all on their articles.

00:17:31.675 --> 00:17:36.670
The other thing was that almost none,

00:17:36.670 --> 00:17:39.580
in fact just one, of the Elsevier articles

00:17:39.580 --> 00:17:40.870
actually said that the article is in

00:17:40.870 --> 00:17:42.985
the public domain in the United States.

00:17:42.985 --> 00:17:45.070
Between these two things,

00:17:45.070 --> 00:17:47.260
we thought it was quite egregious for

00:17:47.260 --> 00:17:49.750
the world's largest publisher

00:17:49.750 --> 00:17:55.310
to do such a poor job of labeling these articles.

00:17:56.520 --> 00:17:59.350
Of these articles missing a statement,

00:17:59.350 --> 00:18:00.610
a large number had something

00:18:00.610 --> 00:18:02.275
like published by Elsevier at the bottom,

00:18:02.275 --> 00:18:05.260
but nothing in the way of the copyright statement.

00:18:05.260 --> 00:18:08.170
And we realize that there's no legal requirement to do so,

00:18:08.170 --> 00:18:11.830
but we are suggesting that it's a good practice in

00:18:11.830 --> 00:18:13.090
scholarly communication to have

00:18:13.090 --> 00:18:16.255
a copyright statement on the PDF.

00:18:16.255 --> 00:18:20.170
Its absence doesn't encourage scholarly sharing

00:18:20.170 --> 00:18:22.705
or use in the least.

00:18:22.705 --> 00:18:24.850
Thirteen of these articles

00:18:24.850 --> 00:18:27.370
have sole affiliations from the USDA,

00:18:27.370 --> 00:18:30.205
and so again, nothing

00:18:30.205 --> 00:18:34.460
acknowledging the public domain whatsoever.

00:18:34.970 --> 00:18:38.415
We also wanted to look at the open-access publishers.

00:18:38.415 --> 00:18:41.130
One of the things that caught my eye in the USDA

00:18:41.130 --> 00:18:44.765
set were the PLoS journals.

00:18:44.765 --> 00:18:48.430
Almost all of them except for

00:18:48.430 --> 00:18:53.290
one had a CC0 public domain dedication.

00:18:53.290 --> 00:18:56.320
I emailed PLoS about this and one of

00:18:56.320 --> 00:18:58.630
their editorial managers emailed me back and said,

00:18:58.630 --> 00:19:02.455
"Yes indeed, when there's a US government author on one of our articles,

00:19:02.455 --> 00:19:07.495
we assign the CC0 public domain dedication."

00:19:07.495 --> 00:19:09.880
The numbers in parens here are the numbers in

00:19:09.880 --> 00:19:12.910
the combined sample datasets.

00:19:12.910 --> 00:19:17.035
We had 18 articles from PLoS between NCI and USDA.

00:19:17.035 --> 00:19:19.060
I also noticed the CC0 from

00:19:19.060 --> 00:19:22.255
PeerJ and emailed them as well,

00:19:22.255 --> 00:19:25.614
and they said it was always author choice.

00:19:25.614 --> 00:19:27.970
We only had two articles by them,

00:19:27.970 --> 00:19:32.695
but one set of authors chose CC0 and the other one CC BY.

00:19:32.695 --> 00:19:36.820
The other large open access publishers, MDPI, Frontiers,

00:19:36.820 --> 00:19:41.410
and BioMed Central use their standard CC BY license.

00:19:41.410 --> 00:19:43.090
But it's interesting that there's

00:19:43.090 --> 00:19:45.250
no acknowledgment from any of

00:19:45.250 --> 00:19:47.530
these publishers really that these

00:19:47.530 --> 00:19:48.700
are works that are in the public

00:19:48.700 --> 00:19:50.095
domain in the United States.

00:19:50.095 --> 00:19:53.710
At least PLoS is recognizing it by assigning

00:19:53.710 --> 00:19:58.580
an international license. Jimmy.

00:19:59.880 --> 00:20:02.500
It's also interesting as we

00:20:02.500 --> 00:20:05.170
think about disciplinary differences.

00:20:05.170 --> 00:20:09.415
When we back up and look at the entire data set,

00:20:09.415 --> 00:20:11.170
not just the sample,

00:20:11.170 --> 00:20:19.900
the full 8,523 articles that were published in 2019,

00:20:19.900 --> 00:20:22.795
while we found very little difference

00:20:22.795 --> 00:20:25.150
in copyright labeling practices,

00:20:25.150 --> 00:20:26.890
these articles did have

00:20:26.890 --> 00:20:33.535
a very different open access portrait.

00:20:33.535 --> 00:20:37.660
Whereas 21 percent of

00:20:37.660 --> 00:20:41.170
the articles from the NCI were paywalled,

00:20:41.170 --> 00:20:46.360
not open access. Twice as many,

00:20:46.360 --> 00:20:51.115
43 percent from the USDA were paywalled.

00:20:51.115 --> 00:20:53.950
We think this is largely due to

00:20:53.950 --> 00:20:58.300
the NIH mandate for deposit and

00:20:58.300 --> 00:21:00.220
our definition of being

00:21:00.220 --> 00:21:03.100
paywalled versus open access

00:21:03.100 --> 00:21:05.515
comes from the Web of Science,

00:21:05.515 --> 00:21:12.190
the OurResearch classification that is there.

00:21:12.190 --> 00:21:17.320
Combined, in 2019, of these US government authored works,

00:21:17.320 --> 00:21:22.870
more than a third were not available open access.

00:21:22.870 --> 00:21:26.260
This gets even more interesting as we

00:21:26.260 --> 00:21:29.755
look historically at what

00:21:29.755 --> 00:21:34.780
a large chunk of literature is bottled

00:21:34.780 --> 00:21:39.805
up behind paywalls that should be in the public domain.

00:21:39.805 --> 00:21:41.860
We're talking about two pretty

00:21:41.860 --> 00:21:43.720
big US government agencies,

00:21:43.720 --> 00:21:46.660
but we're not talking about all of them.

00:21:46.660 --> 00:21:52.615
Between the NCI and the USDA over 50 years,

00:21:52.615 --> 00:21:57.415
almost 200,000 articles are behind paywalls.

00:21:57.415 --> 00:21:59.890
Not surprisingly, we see

00:21:59.890 --> 00:22:06.850
a larger percentage back in the '70s and '80s,

00:22:06.850 --> 00:22:11.380
before a lot of open access advocacy.

00:22:11.380 --> 00:22:14.665
We also wanted to look at the OA status

00:22:14.665 --> 00:22:18.445
of the articles with the public domain statement on them.

00:22:18.445 --> 00:22:20.260
From the USDA set,

00:22:20.260 --> 00:22:23.800
there were 51 articles or 14 percent of the sample set,

00:22:23.800 --> 00:22:27.040
and for NCI it was 35 of

00:22:27.040 --> 00:22:30.550
the articles in that set or 11 percent.

00:22:30.550 --> 00:22:32.950
We used the Unpaywall simple

00:22:32.950 --> 00:22:34.990
query tool and we're very grateful to

00:22:34.990 --> 00:22:37.810
Unpaywall where you can get

00:22:37.810 --> 00:22:41.590
a set of DOIs and enter them into that web tool,

00:22:41.590 --> 00:22:44.425
and they email you a spreadsheet

00:22:44.425 --> 00:22:46.360
of the OA status for those works.

00:22:46.360 --> 00:22:50.470
You can see in the table below that for over half,

00:22:50.470 --> 00:22:52.435
now I'm on the far right,

00:22:52.435 --> 00:22:55.310
the published version was open.

00:22:56.100 --> 00:22:58.870
But for a few,

00:22:58.870 --> 00:23:03.070
the accepted version was the only one that was open.

00:23:03.070 --> 00:23:05.335
Then for submitted version or pre-prints,

00:23:05.335 --> 00:23:07.720
the only two came from the NCI dataset.

00:23:07.720 --> 00:23:10.030
But the category that really interested me the

00:23:10.030 --> 00:23:12.040
most as a repository manager is

00:23:12.040 --> 00:23:13.450
the column that says

00:23:13.450 --> 00:23:17.290
no OA version and so they couldn't find any OA version,

00:23:17.290 --> 00:23:20.590
18 articles in the USDA sample set and

00:23:20.590 --> 00:23:24.175
five in the NCI sample set.

00:23:24.175 --> 00:23:27.340
From the perspective of a repository manager,

00:23:27.340 --> 00:23:29.920
if you can provide the only OA copy,

00:23:29.920 --> 00:23:31.510
then you're probably going to get a lot

00:23:31.510 --> 00:23:34.340
of views and downloads.

00:23:34.770 --> 00:23:37.750
We also wanted to look at cases

00:23:37.750 --> 00:23:41.064
where all of the authors are US government employees.

00:23:41.064 --> 00:23:43.960
We did this for the USDA sample,

00:23:43.960 --> 00:23:49.735
which again was 362 articles out of the larger dataset.

00:23:49.735 --> 00:23:53.395
We found that 35 of them or 10 percent

00:23:53.395 --> 00:23:56.845
were authored only by USDA authors.

00:23:56.845 --> 00:24:00.850
Ten of those were marked as public domain in the US,

00:24:00.850 --> 00:24:05.260
23 were paywalled, or

00:24:05.260 --> 00:24:07.510
about two-thirds and that

00:24:07.510 --> 00:24:12.140
included six that were marked public domain in the US.

00:24:12.360 --> 00:24:15.160
We can see a divide here between

00:24:15.160 --> 00:24:17.470
the licensing and the access,

00:24:17.470 --> 00:24:19.990
where if you see a work that's in the public domain

00:24:19.990 --> 00:24:21.370
in the United States, and you can't get

00:24:21.370 --> 00:24:24.025
access to it, it's rather frustrating.

00:24:24.025 --> 00:24:26.875
That's what we see on the screenshot.

00:24:26.875 --> 00:24:29.425
This is from a Wiley journal.

00:24:29.425 --> 00:24:31.720
Although we were looking at PDFs,

00:24:31.720 --> 00:24:35.275
this one is clearly marked on the website as well.

00:24:35.275 --> 00:24:37.270
"This article has been contributed to by

00:24:37.270 --> 00:24:38.860
US government employees and their work

00:24:38.860 --> 00:24:41.290
is in the public domain in the USA."

00:24:41.290 --> 00:24:44.529
If you go to click the little PDF icon,

00:24:44.529 --> 00:24:47.455
then you get this. You don't have access.

00:24:47.455 --> 00:24:49.480
You need to log in to an institution

00:24:49.480 --> 00:24:53.030
or pay a fee to get access to that article.

00:24:53.670 --> 00:24:56.230
One of the other interesting things that we

00:24:56.230 --> 00:24:58.195
found in this research was that

00:24:58.195 --> 00:25:00.130
the US Forest Service has

00:25:00.130 --> 00:25:01.840
a repository where they are

00:25:01.840 --> 00:25:04.255
actively taking these articles,

00:25:04.255 --> 00:25:06.520
whether they have a statement on them or not,

00:25:06.520 --> 00:25:07.945
whether they're open or not,

00:25:07.945 --> 00:25:10.510
and putting them in their repository,

00:25:10.510 --> 00:25:13.135
which is called Treesearch.

00:25:13.135 --> 00:25:15.010
They say that they are taking

00:25:15.010 --> 00:25:16.870
articles published by the agency,

00:25:16.870 --> 00:25:18.490
as well as those published by others,

00:25:18.490 --> 00:25:20.530
including papers appearing in journals,

00:25:20.530 --> 00:25:22.405
conference proceedings, or books.

00:25:22.405 --> 00:25:25.000
And they take a much stronger stand than CENDI does

00:25:25.000 --> 00:25:28.045
on the public domain status of these works.

00:25:28.045 --> 00:25:31.000
"US copyright regulations require that

00:25:31.000 --> 00:25:33.790
publications by the federal government or authored

00:25:33.790 --> 00:25:35.800
by federal employees must remain in

00:25:35.800 --> 00:25:40.190
the public domain." Jimmy.

00:25:42.210 --> 00:25:47.275
An interesting piece on the NCI data set,

00:25:47.275 --> 00:25:52.345
because of the disciplinary nature

00:25:52.345 --> 00:25:55.360
of biomedical research,

00:25:55.360 --> 00:26:00.715
is that it really shows the co-author confusion.

00:26:00.715 --> 00:26:04.480
Many of the articles in the data set had

00:26:04.480 --> 00:26:08.424
many different co-authors and this is just one example,

00:26:08.424 --> 00:26:12.235
not the example with the most co-authors,

00:26:12.235 --> 00:26:16.465
but here's one article that has 35 different co-authors,

00:26:16.465 --> 00:26:20.500
two of whom work at NCI.

00:26:20.500 --> 00:26:22.870
In those 35 co-authors we're

00:26:22.870 --> 00:26:25.285
talking about nine different countries,

00:26:25.285 --> 00:26:27.355
a mix of university,

00:26:27.355 --> 00:26:31.855
governmental, and non-profit employers.

00:26:31.855 --> 00:26:34.510
The copyright statement on

00:26:34.510 --> 00:26:37.060
this article is "Copyright the authors,

00:26:37.060 --> 00:26:38.530
under exclusive license to

00:26:38.530 --> 00:26:43.945
Springer Nature America, Incorporated, 2019."

00:26:43.945 --> 00:26:48.310
The attribution statement of

00:26:48.310 --> 00:26:53.320
credit in terms of author contributions,

00:26:53.320 --> 00:26:56.515
notes that EV and RS

00:26:56.515 --> 00:26:58.465
(that's highlighted on the slide;

00:26:58.465 --> 00:27:01.670
those were the NCI authors)

00:27:02.010 --> 00:27:07.645
contributed to writing the manuscript,

00:27:07.645 --> 00:27:10.450
and the contribution statement also notes

00:27:10.450 --> 00:27:13.765
that all authors discussed and approved the manuscript.

00:27:13.765 --> 00:27:17.830
It's hard to know how much work they actually did and

00:27:17.830 --> 00:27:19.810
whether two out of 35 should put

00:27:19.810 --> 00:27:22.090
this in the public domain,

00:27:22.090 --> 00:27:25.645
but it's also hard to know,

00:27:25.645 --> 00:27:28.900
did Springer Nature actually get 35 signed

00:27:28.900 --> 00:27:31.390
copyright clearance statements from

00:27:31.390 --> 00:27:33.025
all the authors or not.

00:27:33.025 --> 00:27:37.045
So some of our preliminary conclusions

00:27:37.045 --> 00:27:40.540
based on this earlier research are that

00:27:40.540 --> 00:27:43.180
there's a significant number of incorrect assertions of

00:27:43.180 --> 00:27:47.080
copyright, upwards of 46 percent,

00:27:47.080 --> 00:27:53.425
and that there are inconsistent labeling practices

00:27:53.425 --> 00:27:56.795
for US works especially.

00:27:56.795 --> 00:27:59.175
We found this to be consistent

00:27:59.175 --> 00:28:01.605
across two very different disciplines and

00:28:01.605 --> 00:28:03.120
agencies that had

00:28:03.120 --> 00:28:05.520
very different open access profiles

00:28:05.520 --> 00:28:08.315
for their publications.

00:28:08.315 --> 00:28:14.875
Beyond the rational preliminary conclusions,

00:28:14.875 --> 00:28:20.650
you'll give me a brief moment for some wider thinking.

00:28:20.650 --> 00:28:23.425
It definitely raises the question, to me,

00:28:23.425 --> 00:28:26.709
whether copyright is a good fit for scholarship.

00:28:26.709 --> 00:28:30.235
I'll quote Kevin Smith,

00:28:30.235 --> 00:28:34.735
who in the previous session said,

00:28:34.735 --> 00:28:36.370
copyright is a tool that gives

00:28:36.370 --> 00:28:39.160
monopoly power that's leveraged

00:28:39.160 --> 00:28:41.140
against the academy. To me,

00:28:41.140 --> 00:28:44.695
I think we see this partially in this study.

00:28:44.695 --> 00:28:48.474
I think about the Budapest Open Access Initiative goals,

00:28:48.474 --> 00:28:51.520
that the literature that should be freely accessible

00:28:51.520 --> 00:28:53.890
online is that which scholars

00:28:53.890 --> 00:28:56.785
give to the world without expectation of payment.

00:28:56.785 --> 00:29:00.160
I also was really struck

00:29:00.160 --> 00:29:03.565
by a recent book chapter by Charlotte Roh,

00:29:03.565 --> 00:29:07.810
Harrison Inefuku, and Emily Drabinski in the MIT Press book

00:29:07.810 --> 00:29:11.320
Reassembling Scholarly Communications that also

00:29:11.320 --> 00:29:15.145
goes to the social justice aspect of this.

00:29:15.145 --> 00:29:17.470
We don't want to suggest at

00:29:17.470 --> 00:29:19.480
all that open access is going to

00:29:19.480 --> 00:29:21.985
cure everything, and they say,

00:29:21.985 --> 00:29:24.430
"while open access publishing advances

00:29:24.430 --> 00:29:27.715
equitable access to reading scholarly work,

00:29:27.715 --> 00:29:29.650
it does not automatically reverse

00:29:29.650 --> 00:29:32.290
the biases and norms of scholarship

00:29:32.290 --> 00:29:35.470
itself." That was really resonant with

00:29:35.470 --> 00:29:39.160
us, and if you haven't checked out this recent book,

00:29:39.160 --> 00:29:40.750
I would also recommend that,

00:29:40.750 --> 00:29:44.870
just a lot of great chapters. Philip.

00:29:46.380 --> 00:29:51.490
Some of the implications for libraries include

00:29:51.490 --> 00:29:53.920
that the inconsistent labeling practices

00:29:53.920 --> 00:29:56.304
contribute to confusion on reuse.

00:29:56.304 --> 00:29:59.230
Particularly when statements are missing,

00:29:59.230 --> 00:30:01.870
you have obviously a government author

00:30:01.870 --> 00:30:05.360
on the article and there's no recognition of that.

00:30:05.390 --> 00:30:07.590
We also realized that there is

00:30:07.590 --> 00:30:10.440
a complicated risk analysis.

00:30:10.440 --> 00:30:13.940
Neither of us is a lawyer,

00:30:13.940 --> 00:30:16.900
but as we saw earlier,

00:30:16.900 --> 00:30:19.090
international law is complicated,

00:30:19.090 --> 00:30:21.250
this is a jurisdictional public domain,

00:30:21.250 --> 00:30:22.570
and so what about

00:30:22.570 --> 00:30:25.270
the articles in the international arena? Of course,

00:30:25.270 --> 00:30:26.920
if we take one of the articles and put

00:30:26.920 --> 00:30:29.530
it in one of our repositories,

00:30:29.530 --> 00:30:34.630
that's available worldwide. And as we saw,

00:30:34.630 --> 00:30:40.765
the joint work issue with co-authors is also a problem.

00:30:40.765 --> 00:30:42.550
If you have 100 authors and

00:30:42.550 --> 00:30:46.070
only one government author, what does that mean?

00:30:46.070 --> 00:30:50.040
There's also the issue of texts versus the work.

00:30:50.040 --> 00:30:52.035
For example,

00:30:52.035 --> 00:30:54.690
some publishers might claim copyright in the formatting,

00:30:54.690 --> 00:30:58.930
particularly in other countries.

00:30:59.180 --> 00:31:02.025
As Jimmy mentioned earlier,

00:31:02.025 --> 00:31:04.770
there are license collisions, for lack of a better word;

00:31:04.770 --> 00:31:07.890
we saw a few with two statements,

00:31:07.890 --> 00:31:09.710
public domain and CC,

00:31:09.710 --> 00:31:13.750
or public domain and copyrighted, that sort of thing.

00:31:13.750 --> 00:31:17.410
Particularly looking at public domain

00:31:17.410 --> 00:31:18.685
versus some of the more

00:31:18.685 --> 00:31:20.770
limiting Creative Commons licenses,

00:31:20.770 --> 00:31:25.100
like the NC and ND licenses, how does that work?

00:31:26.130 --> 00:31:28.570
We have lots of questions.

00:31:28.570 --> 00:31:30.400
The first and perhaps most important

00:31:30.400 --> 00:31:32.440
coming out of this presentation is that,

00:31:32.440 --> 00:31:35.380
if we are comfortable with the

00:31:35.380 --> 00:31:37.390
public-domain-in-the-US status of

00:31:37.390 --> 00:31:40.510
these articles with any government author on them,

00:31:40.510 --> 00:31:42.580
should there be a concerted effort to

00:31:42.580 --> 00:31:44.215
put them in open repositories,

00:31:44.215 --> 00:31:47.080
because as we've seen, many of them are not open?

00:31:47.080 --> 00:31:50.530
Could this effort apply to states where

00:31:50.530 --> 00:31:54.370
government-authored work is in the public domain?

00:31:54.370 --> 00:31:56.020
So both Jimmy and I are in Virginia,

00:31:56.020 --> 00:31:58.090
which I believe is one of the states

00:31:58.090 --> 00:32:00.490
where government works are in the public domain,

00:32:00.490 --> 00:32:02.230
and so could an article by an employee of

00:32:02.230 --> 00:32:07.030
our Fish and Wildlife Service be put in

00:32:07.030 --> 00:32:09.200
an open repository?

00:32:09.630 --> 00:32:11.770
And as you saw in

00:32:11.770 --> 00:32:14.380
Jimmy's slide about going back through the decades,

00:32:14.380 --> 00:32:16.420
could public domain in the US be a source

00:32:16.420 --> 00:32:18.745
of open access for older articles?

00:32:18.745 --> 00:32:21.160
This has been one of the confounding things

00:32:21.160 --> 00:32:22.465
about open access:

00:32:22.465 --> 00:32:24.460
as it advances every

00:32:24.460 --> 00:32:27.040
year and there's more and more work that's open access,

00:32:27.040 --> 00:32:28.840
we're still looking back at

00:32:28.840 --> 00:32:30.520
a huge back catalog, so to

00:32:30.520 --> 00:32:33.655
speak, of works that are behind paywalls.

00:32:33.655 --> 00:32:37.000
And could there be a technological solution for

00:32:37.000 --> 00:32:39.730
this divide between the licenses and access,

00:32:39.730 --> 00:32:44.500
as we saw on the screenshot of the Wiley journal article

00:32:44.500 --> 00:32:46.840
which says that the article is in

00:32:46.840 --> 00:32:48.280
the public domain in the US,

00:32:48.280 --> 00:32:50.185
but no, you don't have access to it?

00:32:50.185 --> 00:32:53.740
Could publishers potentially recognize

00:32:53.740 --> 00:32:57.070
IP addresses coming from the United States,

00:32:57.070 --> 00:33:03.160
for example, and offer access to those IP addresses?

00:33:03.160 --> 00:33:07.375
There are also other opportunities for libraries.

00:33:07.375 --> 00:33:10.240
Perhaps some of these issues could be addressed by adding

00:33:10.240 --> 00:33:14.604
license addendums to specify labeling best practices,

00:33:14.604 --> 00:33:17.845
to have rights to republish public domain material,

00:33:17.845 --> 00:33:19.945
and perhaps most importantly,

00:33:19.945 --> 00:33:23.890
to remove paywalls for the public domain material.

00:33:23.890 --> 00:33:26.949
What might a solution at scale look like?

00:33:26.949 --> 00:33:29.815
As Jimmy pointed out on one of his slides,

00:33:29.815 --> 00:33:31.030
we're maybe looking at

00:33:31.030 --> 00:33:34.160
hundreds of thousands of articles here.

00:33:34.470 --> 00:33:40.240
One thing we could potentially do is use Unpaywall to

00:33:40.240 --> 00:33:43.390
identify the "no OA copy" articles

00:33:43.390 --> 00:33:46.525
and prioritize those for IR upload.

00:33:46.525 --> 00:33:49.225
Because there is no open access version,

00:33:49.225 --> 00:33:50.560
that could be

00:33:50.560 --> 00:33:52.570
the low-hanging fruit that we focus on first,

00:33:52.570 --> 00:33:54.175
to get into IRs.

00:33:54.175 --> 00:33:56.125
We may also be able to help

00:33:56.125 --> 00:33:58.180
Unpaywall mark these as public domain US.

00:33:58.180 --> 00:33:59.740
When we got our spreadsheets back,

00:33:59.740 --> 00:34:03.205
we noticed that very few were marked PD,

00:34:03.205 --> 00:34:07.975
and so we might be able to help improve the data set.

00:34:07.975 --> 00:34:11.140
Also, Unpaywall has relationships with

00:34:11.140 --> 00:34:13.870
Web of Science and Scopus, and so part

00:34:13.870 --> 00:34:15.895
of that relationship might be

00:34:15.895 --> 00:34:18.520
getting the affiliation data from

00:34:18.520 --> 00:34:20.830
those two services and using

00:34:20.830 --> 00:34:25.970
that to inform the license metadata.

00:34:26.910 --> 00:34:29.410
We want to offer our thanks.

00:34:29.410 --> 00:34:31.450
A couple of months ago we

00:34:31.450 --> 00:34:34.300
attended the ASERL Copyright Office Hours,

00:34:34.300 --> 00:34:36.925
the Association of Southeastern Research Libraries,

00:34:36.925 --> 00:34:38.905
with a lot of these questions,

00:34:38.905 --> 00:34:42.100
and we are grateful for feedback from Brandon Butler and

00:34:42.100 --> 00:34:46.030
Laura Burtle at those office hours. As Jimmy mentioned,

00:34:46.030 --> 00:34:47.740
Peter Hirtle was very gracious in

00:34:47.740 --> 00:34:50.500
replying by email to some of our questions.

00:34:50.500 --> 00:34:52.840
We also thank our scholarly communications colleagues for

00:34:52.840 --> 00:34:55.630
engaging the question on the listserv,

00:34:55.630 --> 00:34:57.670
and of course, the Web of Science for

00:34:57.670 --> 00:35:00.985
the affiliation data, and Unpaywall from

00:35:00.985 --> 00:35:03.970
OurResearch for the open access data.

00:35:03.970 --> 00:35:06.910
But we do want to make it clear that any errors and

00:35:06.910 --> 00:35:10.910
misunderstandings are totally ours.

00:35:11.820 --> 00:35:15.174
We have some selected references

00:35:15.174 --> 00:35:16.840
to the documents that were referred to,

00:35:16.840 --> 00:35:19.915
for example, the CENDI documents, the FAQs,

00:35:19.915 --> 00:35:21.700
and several others with

00:35:21.700 --> 00:35:24.805
links to those works that we'll be providing,

00:35:24.805 --> 00:35:27.175
and so we want to thank you very much for listening,

00:35:27.175 --> 00:35:31.160
and we're anxious to take your questions.
