WEBVTT

1
00:00:00.340 --> 00:00:20.660 A:middle L:90%
no. Okay. Yeah. Welcome, everyone.

2
00:00:22.039 --> 00:00:24.969 A:middle L:90%
Welcome to our kickoff to Open Data Day,

3
00:00:24.969 --> 00:00:27.559 A:middle L:90%
which is tomorrow, the 21st, where we hope the

4
00:00:27.570 --> 00:00:30.440 A:middle L:90%
weather holds off for us to have a good day

5
00:00:30.440 --> 00:00:35.939 A:middle L:90%
of discussions on open data as well as a hackathon on

6
00:00:35.939 --> 00:00:39.270 A:middle L:90%
that. Um, so this is Open Data

7
00:00:39.270 --> 00:00:41.649 A:middle L:90%
Day and CodeAcross. And this is a collaboration

8
00:00:41.649 --> 00:00:44.649 A:middle L:90%
between the university libraries and Code for NRV,

9
00:00:45.240 --> 00:00:48.740 A:middle L:90%
a brigade of Code for America, uh, here

10
00:00:48.740 --> 00:00:52.100 A:middle L:90%
in Blacksburg. Um, my name is Philip Young

11
00:00:52.109 --> 00:00:55.740 A:middle L:90%
. I am the scholarly communications librarian here at university

12
00:00:55.740 --> 00:00:57.649 A:middle L:90%
libraries. And so I work on a lot of

13
00:00:57.659 --> 00:01:02.969 A:middle L:90%
open projects: open scholarship, open access, open data

14
00:01:02.979 --> 00:01:04.760 A:middle L:90%
, which we'll be talking about tonight and tomorrow.

15
00:01:06.439 --> 00:01:10.760 A:middle L:90%
And we also work on open educational resources. Um

16
00:01:10.769 --> 00:01:12.730 A:middle L:90%
, in order to help students save a little money

17
00:01:12.739 --> 00:01:15.390 A:middle L:90%
on some of their learning materials. You may

18
00:01:15.390 --> 00:01:19.810 A:middle L:90%
have seen the posters outside the door about our events

19
00:01:19.810 --> 00:01:23.250 A:middle L:90%
next week. If you're interested in open textbooks and

20
00:01:23.250 --> 00:01:26.959 A:middle L:90%
other open educational resources, please go to our website

21
00:01:26.140 --> 00:01:29.390 A:middle L:90%
and look at some of the events that we have

22
00:01:29.390 --> 00:01:32.340 A:middle L:90%
on tap next week. And we also have our

23
00:01:32.349 --> 00:01:34.849 A:middle L:90%
OER librarian here, Anita Walz, and she'll

24
00:01:34.849 --> 00:01:37.590 A:middle L:90%
be glad to answer any questions that you might have

25
00:01:37.590 --> 00:01:41.670 A:middle L:90%
about open educational resources. Um, in the back

26
00:01:41.680 --> 00:01:42.909 A:middle L:90%
, we have a sign up sheet. Um,

27
00:01:42.920 --> 00:01:47.560 A:middle L:90%
where VT faculty can get NLI credit for

28
00:01:47.560 --> 00:01:49.040 A:middle L:90%
attending tonight. So if you want to get an

29
00:01:49.040 --> 00:01:51.750 A:middle L:90%
NLI credit, please sign up there.

30
00:01:52.239 --> 00:01:55.359 A:middle L:90%
Also, we have a sign up box for the

31
00:01:56.140 --> 00:01:57.379 A:middle L:90%
Open VT listserv, which is, uh, a

32
00:01:57.390 --> 00:02:00.239 A:middle L:90%
campus-wide listserv where we talk about a variety

33
00:02:00.239 --> 00:02:04.400 A:middle L:90%
of open issues, including open access, open data

34
00:02:04.409 --> 00:02:07.689 A:middle L:90%
, and OER. Um, in addition to

35
00:02:07.689 --> 00:02:12.009 A:middle L:90%
our speaker tonight, we have a distinguished guest that

36
00:02:12.009 --> 00:02:15.680 A:middle L:90%
I want to recognize. From the Virginia Tech Board

37
00:02:15.680 --> 00:02:16.909 A:middle L:90%
of Visitors, we have Nancy Dye in our presence

38
00:02:16.909 --> 00:02:20.770 A:middle L:90%
, so thank you very much, Nancy, for

39
00:02:20.770 --> 00:02:23.120 A:middle L:90%
coming tonight to our event. Um, as I

40
00:02:23.120 --> 00:02:27.289 A:middle L:90%
mentioned, this is a collaboration between the university libraries

41
00:02:27.289 --> 00:02:29.759 A:middle L:90%
here at Virginia Tech and Code for NRV.

42
00:02:30.340 --> 00:02:32.590 A:middle L:90%
Um, the co-leaders of Code for NRV

43
00:02:32.599 --> 00:02:36.430 A:middle L:90%
, are Neal, Neal Feierabend, and

44
00:02:36.430 --> 00:02:38.639 A:middle L:90%
Ben Schoenfeld. And I want to introduce Ben to

45
00:02:38.639 --> 00:02:42.080 A:middle L:90%
come up and talk a little bit about Code for

46
00:02:42.080 --> 00:02:49.889 A:middle L:90%
America. Everybody. My name is Ben. I

47
00:02:49.900 --> 00:02:53.189 A:middle L:90%
graduated from Virginia Tech in 2009 with a

48
00:02:53.189 --> 00:02:54.560 A:middle L:90%
degree in computer engineering. Uh, and I'm a

49
00:02:54.560 --> 00:02:58.659 A:middle L:90%
software developer by day, but on nights and weekends

50
00:02:58.669 --> 00:03:00.090 A:middle L:90%
, I volunteer for Code for America. And so

51
00:03:00.090 --> 00:03:01.479 A:middle L:90%
I'd like to take a few minutes just to tell

52
00:03:01.479 --> 00:03:05.830 A:middle L:90%
you about Code for America, since CodeAcross, um

53
00:03:05.840 --> 00:03:07.449 A:middle L:90%
, and Open Data Day are a Code for

54
00:03:07.449 --> 00:03:13.419 A:middle L:90%
America nationwide event. So how many of you are

55
00:03:13.419 --> 00:03:15.729 A:middle L:90%
old enough to remember a time when you had to

56
00:03:15.729 --> 00:03:17.819 A:middle L:90%
go into the bank and interact with someone, uh

57
00:03:17.830 --> 00:03:20.860 A:middle L:90%
, to actually get money or deposit a check?

58
00:03:21.639 --> 00:03:23.580 A:middle L:90%
Um, it seems, uh, the memory seems

59
00:03:23.580 --> 00:03:29.699 A:middle L:90%
quaint because we know how much personal finance has revolutionized

60
00:03:29.710 --> 00:03:32.020 A:middle L:90%
, um, has been revolutionized over the last 20, 25

61
00:03:32.020 --> 00:03:36.770 A:middle L:90%
years. So now we can just take a picture

62
00:03:36.770 --> 00:03:38.780 A:middle L:90%
of a check, and minutes later, the money

63
00:03:38.780 --> 00:03:40.080 A:middle L:90%
shows up in our bank. Um and so this

64
00:03:40.090 --> 00:03:44.439 A:middle L:90%
this technology has, uh, has been or technology

65
00:03:44.439 --> 00:03:47.110 A:middle L:90%
has revolutionized the banking industry, and it's revolutionized almost

66
00:03:47.110 --> 00:03:49.960 A:middle L:90%
every other part of our lives. Uh, but

67
00:03:49.960 --> 00:03:52.849 A:middle L:90%
there's one place that technology hasn't quite revolutionized yet,

68
00:03:52.860 --> 00:03:55.099 A:middle L:90%
and that's that's the government. So this is the

69
00:03:55.099 --> 00:03:57.750 A:middle L:90%
DMV. Uh, it's a

70
00:03:57.750 --> 00:04:00.310 A:middle L:90%
symbol of bureaucracy. Uh, so many people here

71
00:04:00.319 --> 00:04:02.250 A:middle L:90%
or everyone here has to interact with the DMV at

72
00:04:02.250 --> 00:04:04.349 A:middle L:90%
some point, and almost no one has a positive

73
00:04:04.349 --> 00:04:11.250 A:middle L:90%
experience. So if 91% of Americans own a

74
00:04:11.250 --> 00:04:15.139 A:middle L:90%
cell phone, 67% use Facebook and 33% have a

75
00:04:15.139 --> 00:04:16.540 A:middle L:90%
tablet, why is this still how we engage with

76
00:04:16.550 --> 00:04:18.759 A:middle L:90%
government? There has to be a better way.

77
00:04:20.240 --> 00:04:25.730 A:middle L:90%
Um, Code for America is all about trying to

78
00:04:25.730 --> 00:04:30.290 A:middle L:90%
revolutionize how we interact with government. The organization was

79
00:04:30.290 --> 00:04:33.050 A:middle L:90%
founded to change the culture inside government that supports bureaucracy

80
00:04:33.439 --> 00:04:36.360 A:middle L:90%
, breeds disengagement with citizens and makes it hard for

81
00:04:36.360 --> 00:04:40.410 A:middle L:90%
government to come up with innovative solutions to long standing

82
00:04:40.410 --> 00:04:44.639 A:middle L:90%
problems, all using modern networked digital technology and user

83
00:04:44.639 --> 00:04:47.189 A:middle L:90%
centered design approaches. We take four approaches. We

84
00:04:47.189 --> 00:04:50.009 A:middle L:90%
work directly with government officials at the local level to

85
00:04:50.009 --> 00:04:54.149 A:middle L:90%
create the capacity inside government to build innovative solutions to

86
00:04:54.149 --> 00:04:58.069 A:middle L:90%
hard problems. We build communities of technologists and citizens

87
00:04:58.079 --> 00:05:00.110 A:middle L:90%
who want to lend their skills to help build their

88
00:05:00.110 --> 00:05:03.829 A:middle L:90%
governments. We build tools that make citizen interactions

89
00:05:03.829 --> 00:05:06.899 A:middle L:90%
with government easier, simpler and more elegant so that

90
00:05:06.899 --> 00:05:10.860 A:middle L:90%
the experience of government is positive and breeds trust.

91
00:05:11.639 --> 00:05:15.050 A:middle L:90%
We incubate and accelerate civic startups to create new economic

92
00:05:15.050 --> 00:05:17.560 A:middle L:90%
models for those tools. And in this, we're influenced by

93
00:05:17.560 --> 00:05:19.860 A:middle L:90%
the idea that government should act like a platform.

94
00:05:20.339 --> 00:05:24.180 A:middle L:90%
Before the iPhone, phones had 20 or 30 applications,

95
00:05:24.180 --> 00:05:26.680 A:middle L:90%
and now they have millions. And we know that

96
00:05:26.680 --> 00:05:30.180 A:middle L:90%
when governments open data, private companies can deliver innovative

97
00:05:30.180 --> 00:05:33.389 A:middle L:90%
solutions. So Code for America has five parts.

98
00:05:33.399 --> 00:05:36.339 A:middle L:90%
Um, and tonight I'll just briefly talk about the

99
00:05:36.350 --> 00:05:39.230 A:middle L:90%
part we participated in, which is the brigade.

100
00:05:39.240 --> 00:05:41.600 A:middle L:90%
So we are Code for New River Valley. Uh

101
00:05:41.610 --> 00:05:45.399 A:middle L:90%
, we are an all-volunteer chapter of Code for America

102
00:05:45.410 --> 00:05:47.259 A:middle L:90%
, right here in Blacksburg. We are the fourth

103
00:05:47.540 --> 00:05:50.389 A:middle L:90%
Brigade in Virginia. There's one in Hampton Roads,

104
00:05:50.399 --> 00:05:54.029 A:middle L:90%
one in Richmond, one in Northern Virginia. And

105
00:05:54.029 --> 00:05:58.389 A:middle L:90%
we just started up in December. Um, so

106
00:05:58.389 --> 00:06:00.990 A:middle L:90%
like I said, Code for America

107
00:06:00.990 --> 00:06:03.180 A:middle L:90%
brigades are springing up all over the country completely organically

108
00:06:03.189 --> 00:06:05.610 A:middle L:90%
. Uh, the brigade program started about three years

109
00:06:05.610 --> 00:06:09.209 A:middle L:90%
ago. Uh, this time last year, there

110
00:06:09.209 --> 00:06:11.259 A:middle L:90%
were 10 or 20 brigades, and now there are

111
00:06:11.269 --> 00:06:15.139 A:middle L:90%
about 100 brigades nationwide, Uh, and and internationally

112
00:06:15.139 --> 00:06:21.180 A:middle L:90%
, too. So our brigades advocate for open

113
00:06:21.180 --> 00:06:24.910 A:middle L:90%
data and, uh, the best way we know

114
00:06:24.910 --> 00:06:27.779 A:middle L:90%
how to do this is by taking open data and

115
00:06:27.779 --> 00:06:30.100 A:middle L:90%
showing people what you can do with it and how

116
00:06:30.100 --> 00:06:31.850 A:middle L:90%
powerful it can be. In Hampton Roads, we

117
00:06:31.850 --> 00:06:35.100 A:middle L:90%
created a real time bus tracker. We got,

118
00:06:35.110 --> 00:06:40.220 A:middle L:90%
um, data from the bus company, and we

119
00:06:40.220 --> 00:06:41.920 A:middle L:90%
built that nice web app that you saw that you

120
00:06:41.920 --> 00:06:44.250 A:middle L:90%
can get to on your iPhone. But accessibility really

121
00:06:44.250 --> 00:06:46.379 A:middle L:90%
matters to us. So we also built a text

122
00:06:46.379 --> 00:06:48.689 A:middle L:90%
message interface because we want these technologies to get to

123
00:06:48.689 --> 00:06:53.449 A:middle L:90%
as many people as possible. This is what the

124
00:06:53.449 --> 00:06:55.319 A:middle L:90%
open data looks like that we got. So this

125
00:06:55.319 --> 00:06:57.430 A:middle L:90%
is just a big file. Um, and you

126
00:06:57.430 --> 00:07:00.019 A:middle L:90%
might recognize some numbers in there. Uh, there's

127
00:07:00.029 --> 00:07:01.889 A:middle L:90%
dates and lat/long coordinates. Um, and so

128
00:07:01.889 --> 00:07:04.540 A:middle L:90%
we took that data, and we turned it into

129
00:07:04.550 --> 00:07:06.480 A:middle L:90%
what you see, what you saw on the iPhone

130
00:07:06.480 --> 00:07:10.639 A:middle L:90%
there, that map, and helped people find the schedule

131
00:07:10.639 --> 00:07:12.879 A:middle L:90%
and find how to catch the bus. And we

132
00:07:12.879 --> 00:07:15.930 A:middle L:90%
currently have over 200 users a day use that application

133
00:07:15.939 --> 00:07:19.259 A:middle L:90%
, um, really, without any kind of promotion

134
00:07:19.269 --> 00:07:21.399 A:middle L:90%
behind it. So it's been really helpful to a

135
00:07:21.399 --> 00:07:23.860 A:middle L:90%
lot of people, and we're really proud of it

136
00:07:25.240 --> 00:07:28.319 A:middle L:90%
. Blacksburg has real-time bus data too. Of course

137
00:07:28.329 --> 00:07:30.949 A:middle L:90%
, Blacksburg is a little more advanced in their transit

138
00:07:30.949 --> 00:07:32.430 A:middle L:90%
than a lot of places, thanks to all the

139
00:07:32.430 --> 00:07:34.769 A:middle L:90%
people who ride the bus here. But, um

140
00:07:34.779 --> 00:07:36.720 A:middle L:90%
, we could do some really some really interesting and

141
00:07:36.720 --> 00:07:39.870 A:middle L:90%
innovative things with this this real time data, and

142
00:07:39.870 --> 00:07:42.250 A:middle L:90%
I hope to to work on some of those tomorrow

143
00:07:43.740 --> 00:07:46.870 A:middle L:90%
. Another application that brigades in Virginia developed is open

144
00:07:46.870 --> 00:07:49.519 A:middle L:90%
health inspection dot com. Uh, all restaurants are

145
00:07:49.529 --> 00:07:53.449 A:middle L:90%
visited, uh, quarterly, I believe, by

146
00:07:53.449 --> 00:07:56.879 A:middle L:90%
the health inspector to check for violations. And,

147
00:07:56.889 --> 00:07:59.649 A:middle L:90%
um, that information is public and it's up on

148
00:07:59.649 --> 00:08:01.769 A:middle L:90%
a website somewhere that's very difficult to access. Um

149
00:08:01.779 --> 00:08:03.540 A:middle L:90%
, it's hard to find, and it's very difficult

150
00:08:03.540 --> 00:08:05.000 A:middle L:90%
to get to on your mobile phone. So we

151
00:08:05.000 --> 00:08:07.220 A:middle L:90%
created an application. Uh, you can pull it

152
00:08:07.220 --> 00:08:11.029 A:middle L:90%
up on your smartphone, and, um, it'll

153
00:08:11.040 --> 00:08:13.459 A:middle L:90%
it'll figure out where you are, and it'll show

154
00:08:13.459 --> 00:08:15.019 A:middle L:90%
you the closest restaurants to you. I pulled this

155
00:08:15.019 --> 00:08:16.769 A:middle L:90%
up before I came in here, and, uh

156
00:08:16.779 --> 00:08:18.470 A:middle L:90%
, you can see that, uh, that these

157
00:08:18.470 --> 00:08:20.949 A:middle L:90%
are all the restaurants around. We also created this

158
00:08:20.949 --> 00:08:22.680 A:middle L:90%
scoring system because Virginia doesn't have a scoring system like

159
00:08:22.680 --> 00:08:24.959 A:middle L:90%
some states do. Uh, so we created this

160
00:08:24.970 --> 00:08:28.180 A:middle L:90%
one of our data scientists created this, uh

161
00:08:28.189 --> 00:08:30.860 A:middle L:90%
, to try to help people understand the violation system

162
00:08:31.840 --> 00:08:35.279 A:middle L:90%
. And finally, another project that we came up

163
00:08:35.279 --> 00:08:39.139 A:middle L:90%
with back in November was done with a collaboration with

164
00:08:39.139 --> 00:08:41.860 A:middle L:90%
some reporters at The Roanoke Times, and they came

165
00:08:41.860 --> 00:08:43.340 A:middle L:90%
up. They came to our brigade and said,

166
00:08:43.350 --> 00:08:46.769 A:middle L:90%
Uh, we have difficulty searching on, uh,

167
00:08:46.779 --> 00:08:50.970 A:middle L:90%
searching court cases on the Virginia court website. They

168
00:08:50.970 --> 00:08:52.980 A:middle L:90%
make you search one locality at a time, and

169
00:08:52.980 --> 00:08:54.769 A:middle L:90%
we'd like to search all of them, and 120 localities

170
00:08:54.769 --> 00:08:58.360 A:middle L:90%
takes a long time to search through. So in

171
00:08:58.370 --> 00:09:00.519 A:middle L:90%
just over a weekend, we built this site that

172
00:09:00.519 --> 00:09:03.840 A:middle L:90%
allows them to search through localities, and we get

173
00:09:03.840 --> 00:09:05.519 A:middle L:90%
about 10 users a day on this site. And

174
00:09:05.519 --> 00:09:07.450 A:middle L:90%
so it's been really helpful to, uh to our

175
00:09:07.450 --> 00:09:11.399 A:middle L:90%
journalists all over all over the state. Um,

176
00:09:11.399 --> 00:09:13.830 A:middle L:90%
so now we'll talk about some of the things we

177
00:09:13.830 --> 00:09:16.559 A:middle L:90%
hope to accomplish tomorrow at Open Data Day.

178
00:09:16.570 --> 00:09:18.730 A:middle L:90%
Um, the main thing we'd like to we'd like

179
00:09:18.730 --> 00:09:20.340 A:middle L:90%
to do is have these round table discussions, and

180
00:09:20.350 --> 00:09:24.519 A:middle L:90%
the idea is just like the last example I provided

181
00:09:24.529 --> 00:09:26.070 A:middle L:90%
. If we get subject matter experts in the same

182
00:09:26.070 --> 00:09:30.289 A:middle L:90%
room with civic minded volunteers, we can create some

183
00:09:30.289 --> 00:09:33.740 A:middle L:90%
really interesting applications. And so we're gonna have these

184
00:09:33.740 --> 00:09:35.850 A:middle L:90%
roundtables where we sit down and talk about mapping and

185
00:09:35.860 --> 00:09:39.110 A:middle L:90%
GIS systems. We're gonna talk more about

186
00:09:39.110 --> 00:09:39.720 A:middle L:90%
journalism and see if we can come up with any

187
00:09:39.720 --> 00:09:43.059 A:middle L:90%
more ideas like the one we had there. We're

188
00:09:43.059 --> 00:09:46.129 A:middle L:90%
gonna talk about public policy and trying to use data

189
00:09:46.139 --> 00:09:48.990 A:middle L:90%
to make, um, data driven decisions in in

190
00:09:48.990 --> 00:09:50.570 A:middle L:90%
cities and and state government. And we're also going

191
00:09:50.570 --> 00:09:52.779 A:middle L:90%
to talk about health. Um, VBI

192
00:09:52.779 --> 00:09:56.149 A:middle L:90%
here at Virginia Tech had a great, um,

193
00:09:56.159 --> 00:09:58.580 A:middle L:90%
hackathon a few months ago on Ebola data

194
00:09:58.590 --> 00:10:01.269 A:middle L:90%
. And so we hope to carry that torch and

195
00:10:01.269 --> 00:10:03.559 A:middle L:90%
maybe come up with some more innovative things that we

196
00:10:03.559 --> 00:10:07.539 A:middle L:90%
can do. At the same time, in the background,

197
00:10:07.539 --> 00:10:09.019 A:middle L:90%
we're gonna be running a hackathon. Uh, so

198
00:10:09.029 --> 00:10:11.220 A:middle L:90%
the ideas that come out of these roundtables we hope

199
00:10:11.220 --> 00:10:16.559 A:middle L:90%
to start implementing, uh, immediately and some of

200
00:10:16.559 --> 00:10:18.220 A:middle L:90%
the themes, like I discussed before: transportation. We've got

201
00:10:18.220 --> 00:10:24.039 A:middle L:90%
that real-time transit app. We've also got some

202
00:10:24.049 --> 00:10:26.669 A:middle L:90%
some parking data, and we'd like to try to

203
00:10:26.669 --> 00:10:28.039 A:middle L:90%
make finding a parking spot in downtown Blacksburg a little

204
00:10:28.039 --> 00:10:31.659 A:middle L:90%
easier. Uh, we're like I said, we're

205
00:10:31.659 --> 00:10:33.009 A:middle L:90%
gonna work on the health data, the public policy

206
00:10:33.009 --> 00:10:37.929 A:middle L:90%
data. We've got a great new API for finding

207
00:10:37.929 --> 00:10:41.659 A:middle L:90%
out about information about Virginia businesses. Um, and

208
00:10:41.669 --> 00:10:43.029 A:middle L:90%
we've got the court data that I mentioned before that

209
00:10:43.029 --> 00:10:46.049 A:middle L:90%
we can look through and we've got some other great

210
00:10:46.059 --> 00:10:48.059 A:middle L:90%
data sets. Virginia, the governor. The governor's

211
00:10:48.059 --> 00:10:50.549 A:middle L:90%
office just launched an open data portal with with a

212
00:10:50.549 --> 00:10:52.100 A:middle L:90%
lot of great resources on it that we hope to

213
00:10:52.100 --> 00:10:54.350 A:middle L:90%
leverage. And of course, there's a U.

214
00:10:54.350 --> 00:10:56.259 A:middle L:90%
S. Census API that's, that's really good.

215
00:10:58.840 --> 00:11:00.580 A:middle L:90%
And finally, if hacking is not your thing,

216
00:11:00.580 --> 00:11:01.580 A:middle L:90%
if you're not a programmer, uh, we're gonna

217
00:11:01.580 --> 00:11:05.000 A:middle L:90%
be we'll have a few activities. Um, for

218
00:11:05.009 --> 00:11:07.350 A:middle L:90%
anyone who knows how to Google, uh, we'll

219
00:11:07.350 --> 00:11:11.370 A:middle L:90%
have open street map editing where, uh, we'll

220
00:11:11.370 --> 00:11:13.549 A:middle L:90%
pull information into open street map. It's a really

221
00:11:13.549 --> 00:11:16.259 A:middle L:90%
easy editor. Just click and type. We've also

222
00:11:16.259 --> 00:11:18.649 A:middle L:90%
got two city censuses. So these are, the Open

223
00:11:18.649 --> 00:11:20.769 A:middle L:90%
Data Census is, uh, is a collection

224
00:11:20.779 --> 00:11:26.379 A:middle L:90%
of data sets. That code for America has found

225
00:11:26.389 --> 00:11:28.960 A:middle L:90%
that most cities have and should have, um,

226
00:11:28.970 --> 00:11:31.409 A:middle L:90%
openly accessible, and many cities have already contributed.

227
00:11:31.409 --> 00:11:33.090 A:middle L:90%
You can go out on the list and see other

228
00:11:33.090 --> 00:11:35.149 A:middle L:90%
cities and how accessible their data sets are and where

229
00:11:35.149 --> 00:11:37.250 A:middle L:90%
they are. And so we'd like to put Blacksburg

230
00:11:37.250 --> 00:11:39.450 A:middle L:90%
in there and get it on the map. We're

231
00:11:39.450 --> 00:11:43.340 A:middle L:90%
also going to do a new census called the local

232
00:11:43.350 --> 00:11:48.080 A:middle L:90%
Digital Services Census. And so that's a collection of

233
00:11:48.080 --> 00:11:50.549 A:middle L:90%
how accessible things like, um, how easy is

234
00:11:50.549 --> 00:11:52.889 A:middle L:90%
it to go find your town council members and and

235
00:11:52.889 --> 00:11:54.539 A:middle L:90%
figure out how to contact them? And how easy

236
00:11:54.539 --> 00:11:56.159 A:middle L:90%
is it to find? Um, you know,

237
00:11:56.159 --> 00:11:58.720 A:middle L:90%
bus schedules and things like that. So if you

238
00:11:58.720 --> 00:12:01.179 A:middle L:90%
can, If you can get on a computer and

239
00:12:01.190 --> 00:12:03.200 A:middle L:90%
find this information and then write down in the census

240
00:12:03.200 --> 00:12:07.679 A:middle L:90%
how easy it is to find, then you'd be

241
00:12:07.679 --> 00:12:09.700 A:middle L:90%
able to help us. Breakfast and lunch will be

242
00:12:09.700 --> 00:12:13.330 A:middle L:90%
provided tomorrow. We'll start here at nine o'clock in

243
00:12:13.330 --> 00:12:16.159 A:middle L:90%
the morning and go till five weather permitting. Uh

244
00:12:16.169 --> 00:12:16.559 A:middle L:90%
, and so we hope to see you all there

245
00:12:18.139 --> 00:12:26.750 A:middle L:90%
. Thanks. All right. Thanks, Ben.

246
00:12:26.639 --> 00:12:31.860 A:middle L:90%
So we're really honored to have tonight, uh,

247
00:12:31.440 --> 00:12:37.179 A:middle L:90%
Waldo Jaquith, who drove down here. Waldo Jaquith

248
00:12:37.179 --> 00:12:37.899 A:middle L:90%
is the director of the U.S.

249
00:12:37.909 --> 00:12:41.740 A:middle L:90%
Open Data Institute, an organization building the capacity of

250
00:12:41.740 --> 00:12:46.659 A:middle L:90%
open data and supporting government in that mission. In 2011

251
00:12:46.669 --> 00:12:50.360 A:middle L:90%
, in acknowledgement of his open data work, Jaquith

252
00:12:50.360 --> 00:12:52.409 A:middle L:90%
was named a Champion of Change by the White

253
00:12:52.409 --> 00:12:56.659 A:middle L:90%
House, and in 2012 an OpenGov Champion

254
00:12:56.139 --> 00:13:00.190 A:middle L:90%
by the Sunlight Foundation. He went on to work

255
00:13:00.200 --> 00:13:01.940 A:middle L:90%
on open data with the White House Office of Science

256
00:13:01.940 --> 00:13:07.509 A:middle L:90%
and Technology Policy. Jaquith is a 2005 Virginia

257
00:13:07.509 --> 00:13:11.129 A:middle L:90%
Tech graduate and lives near Charlottesville with his wife and

258
00:13:11.129 --> 00:13:13.990 A:middle L:90%
son. And just this afternoon, he was telling

259
00:13:13.990 --> 00:13:16.169 A:middle L:90%
me that he had an offer to speak about open

260
00:13:16.169 --> 00:13:20.460 A:middle L:90%
data in Europe, but he chose Blacksburg. Please

261
00:13:20.460 --> 00:13:28.360 A:middle L:90%
welcome Waldo Jaquith. Yeah, I'm glad to

262
00:13:28.360 --> 00:13:30.250 A:middle L:90%
be here. I I have not been to Blacksburg

263
00:13:30.250 --> 00:13:33.409 A:middle L:90%
since 2007. Uh, and I graduated from Virginia

264
00:13:33.409 --> 00:13:35.570 A:middle L:90%
Tech in 2005, although confessedly as

265
00:13:35.570 --> 00:13:39.490 A:middle L:90%
a late life college student. So I'm going to

266
00:13:39.490 --> 00:13:41.220 A:middle L:90%
talk to you all for for a little while about

267
00:13:41.230 --> 00:13:43.980 A:middle L:90%
, uh, first, I'll explain generally about open

268
00:13:43.980 --> 00:13:46.740 A:middle L:90%
data: what it is, how it works, why

269
00:13:46.740 --> 00:13:48.179 A:middle L:90%
it's useful, how it's different than closed data

270
00:13:48.179 --> 00:13:52.159 A:middle L:90%
or information generally, or things like that. Then

271
00:13:52.159 --> 00:13:52.929 A:middle L:90%
we're gonna look at just basically tell you a couple

272
00:13:52.929 --> 00:13:56.210 A:middle L:90%
of stories about the process of how I've opened a

273
00:13:56.210 --> 00:13:58.220 A:middle L:90%
couple of data sets. And then we'll do a

274
00:13:58.220 --> 00:14:01.809 A:middle L:90%
quick rundown of how things are looking in Virginia,

275
00:14:01.809 --> 00:14:03.950 A:middle L:90%
sort of a state of the commonwealth for data here

276
00:14:05.740 --> 00:14:07.370 A:middle L:90%
. So I want to begin with the question of

277
00:14:07.379 --> 00:14:11.169 A:middle L:90%
what is open data. So here's some context.

278
00:14:11.169 --> 00:14:13.450 A:middle L:90%
We have the whole universe of data, which is

279
00:14:13.450 --> 00:14:15.789 A:middle L:90%
the biggest circle there. So open data is just

280
00:14:15.789 --> 00:14:18.460 A:middle L:90%
a small subset of that. That's the portion of

281
00:14:18.460 --> 00:14:20.279 A:middle L:90%
all data that is publicly accessible. And believe me,

282
00:14:20.299 --> 00:14:22.889 A:middle L:90%
this is not to scale in any way. This

283
00:14:22.889 --> 00:14:24.639 A:middle L:90%
circle for open data would be so tiny you would

284
00:14:24.639 --> 00:14:26.009 A:middle L:90%
not be able to spot it. It's like when you

285
00:14:26.009 --> 00:14:30.110 A:middle L:90%
see, like, how many Earths can you fit

286
00:14:30.110 --> 00:14:31.360 A:middle L:90%
inside the sun like a million or whatever? It's

287
00:14:31.360 --> 00:14:33.179 A:middle L:90%
an absurd number. It's a little bit like that

288
00:14:33.179 --> 00:14:37.309 A:middle L:90%
in scale. There is some overlap with big data

289
00:14:37.309 --> 00:14:39.850 A:middle L:90%
and open data, but they're really basically unrelated.

290
00:14:39.340 --> 00:14:41.830 A:middle L:90%
People like to conflate them. I think the

291
00:14:41.840 --> 00:14:46.120 A:middle L:90%
governor even lumps big data and open data into the

292
00:14:46.120 --> 00:14:48.379 A:middle L:90%
same bucket. Uh, this doesn't actually make any

293
00:14:48.379 --> 00:14:50.720 A:middle L:90%
sense in any way. A lot of the most

294
00:14:50.720 --> 00:14:52.350 A:middle L:90%
valuable open data is really, really small. Uh

295
00:14:52.360 --> 00:14:54.830 A:middle L:90%
, it is little data. It would fit on

296
00:14:54.830 --> 00:14:56.009 A:middle L:90%
a floppy disk. That's some really, really useful

297
00:14:56.009 --> 00:14:58.620 A:middle L:90%
data. So although my organization's work is on open

298
00:14:58.620 --> 00:15:01.629 A:middle L:90%
data generally, most of my remarks here are

299
00:15:01.629 --> 00:15:03.429 A:middle L:90%
going to be confined to a subset of that which

300
00:15:03.429 --> 00:15:05.490 A:middle L:90%
is open government data. So here's the difference between

301
00:15:05.490 --> 00:15:09.840 A:middle L:90%
data and open data. Here's what allows data to

302
00:15:09.840 --> 00:15:13.190 A:middle L:90%
be transmuted into gold, into open data. So the

303
00:15:13.190 --> 00:15:16.379 A:middle L:90%
first is being able to get it. If you can't

304
00:15:16.379 --> 00:15:18.019 A:middle L:90%
get it, if it's not possible for you to

305
00:15:18.019 --> 00:15:20.139 A:middle L:90%
acquire it, then it is clearly not in fact

306
00:15:20.230 --> 00:15:24.039 A:middle L:90%
open. Ah, to be data it has to

307
00:15:24.039 --> 00:15:26.029 A:middle L:90%
be readable by software. Software has to be able

308
00:15:26.029 --> 00:15:28.039 A:middle L:90%
to do something with it. But there's another level

309
00:15:28.039 --> 00:15:31.279 A:middle L:90%
of readable, and this is a little harder

310
00:15:31.279 --> 00:15:33.230 A:middle L:90%
to explain. And every time I explain it,

311
00:15:33.230 --> 00:15:35.090 A:middle L:90%
I hope I get a little better. And that

312
00:15:35.090 --> 00:15:39.259 A:middle L:90%
is the meaning of that data. The actual values

313
00:15:39.259 --> 00:15:41.370 A:middle L:90%
contained within it. Software has to be able to

314
00:15:41.370 --> 00:15:43.850 A:middle L:90%
understand it, so I'll make a short distinction here

315
00:15:43.850 --> 00:15:46.789 A:middle L:90%
, but I'll explain it better with some some graphics

316
00:15:46.789 --> 00:15:50.600 A:middle L:90%
in a minute here. If I have a spreadsheet

317
00:15:50.610 --> 00:15:52.870 A:middle L:90%
of a budget for a municipality and I take a

318
00:15:52.870 --> 00:15:54.649 A:middle L:90%
photo of it and I email it to somebody.

319
00:15:56.240 --> 00:15:58.480 A:middle L:90%
Well, that is data in the sense that it's

320
00:15:58.480 --> 00:16:02.250 A:middle L:90%
a photo that's on a computer. But the numbers

321
00:16:02.250 --> 00:16:03.159 A:middle L:90%
on those spreadsheets, I can do nothing with them

322
00:16:03.740 --> 00:16:06.529 A:middle L:90%
. I can't fiddle with the numbers and see how

323
00:16:06.529 --> 00:16:08.159 A:middle L:90%
it changes. It's just a picture, so that

324
00:16:08.159 --> 00:16:11.620 A:middle L:90%
is data in the abstract. But it's not data

325
00:16:11.620 --> 00:16:14.820 A:middle L:90%
in the sense that software can't read the information within

326
00:16:14.820 --> 00:16:15.960 A:middle L:90%
that spreadsheet. And so that's That's what I mean

327
00:16:15.960 --> 00:16:19.980 A:middle L:90%
by readable by software. Uh, if you're paying

328
00:16:19.980 --> 00:16:22.980 A:middle L:90%
for it, it's a lot less open if you

329
00:16:22.980 --> 00:16:25.039 A:middle L:90%
have to pay for that data in order to receive

330
00:16:25.039 --> 00:16:26.190 A:middle L:90%
it. It's not that it's not open. We're

331
00:16:26.190 --> 00:16:27.450 A:middle L:90%
talking about a sliding scale of openness here. It's

332
00:16:27.450 --> 00:16:30.379 A:middle L:90%
not an existential issue whether data is open, but

333
00:16:30.379 --> 00:16:33.940 A:middle L:90%
if you're paying for it, it's substantially less open

334
00:16:33.299 --> 00:16:36.230 A:middle L:90%
. And finally, you have to be able to

335
00:16:36.230 --> 00:16:37.500 A:middle L:90%
share it and do stuff with it. If it's

336
00:16:37.500 --> 00:16:40.600 A:middle L:90%
shared with you and only you and you can't share

337
00:16:40.600 --> 00:16:41.190 A:middle L:90%
it with anybody else, you can't change it.

338
00:16:41.190 --> 00:16:45.539 A:middle L:90%
If copyright prohibits that, or if a license prohibits

339
00:16:45.539 --> 00:16:48.340 A:middle L:90%
it, it's also not very open. Now there's

340
00:16:48.340 --> 00:16:51.190 A:middle L:90%
a seven step measure and there's an 11 statement,

341
00:16:51.320 --> 00:16:52.700 A:middle L:90%
, so this is a very boiled-down version of what it

342
00:16:52.700 --> 00:16:56.100 A:middle L:90%
means for data to be open. But we do

343
00:16:56.110 --> 00:16:57.210 A:middle L:90%
speak of data being more or less

344
00:16:57.220 --> 00:17:00.600 A:middle L:90%
open. I'm trying to train myself to lose the

345
00:17:00.600 --> 00:17:03.110 A:middle L:90%
absolutist language that data either is or is not

346
00:17:03.110 --> 00:17:06.259 A:middle L:90%
open. That is a false dichotomy. So let's

347
00:17:06.259 --> 00:17:07.190 A:middle L:90%
use weather data as an example, as it is

348
00:17:07.190 --> 00:17:10.650 A:middle L:90%
our most hackneyed example of open data. So I

349
00:17:10.650 --> 00:17:12.049 A:middle L:90%
will just keep going with it. Uh, so

350
00:17:12.059 --> 00:17:15.009 A:middle L:90%
going back in history, I think there's a good

351
00:17:15.009 --> 00:17:17.410 A:middle L:90%
argument to be made that open data, open government data,

352
00:17:17.410 --> 00:17:21.329 A:middle L:90%
really started back in 1870, and that is when

353
00:17:21.339 --> 00:17:23.289 A:middle L:90%
the National Weather Service was started. It wasn't called

354
00:17:23.289 --> 00:17:26.089 A:middle L:90%
that then. Uh, it was created by

355
00:17:26.099 --> 00:17:30.589 A:middle L:90%
the White House. President Grant had been

356
00:17:30.589 --> 00:17:32.019 A:middle L:90%
part of the initiative to create this thing.

357
00:17:32.019 --> 00:17:33.420 A:middle L:90%
But the idea was this. We finally had the

358
00:17:33.420 --> 00:17:37.920 A:middle L:90%
ability to communicate faster than horseback. We had not

359
00:17:37.920 --> 00:17:40.359 A:middle L:90%
just an extensive train network, but we had telegraphy

360
00:17:41.039 --> 00:17:44.369 A:middle L:90%
, so weather observations were being used to make predictions

361
00:17:44.369 --> 00:17:47.599 A:middle L:90%
, and it was finally possible to say, Hey

362
00:17:47.609 --> 00:17:48.980 A:middle L:90%
, there's a storm here in the West. You're in

363
00:17:48.980 --> 00:17:52.460 A:middle L:90%
the East you might want to batten down the hatches

364
00:17:52.940 --> 00:17:55.140 A:middle L:90%
. And so this, uh, system was put

365
00:17:55.140 --> 00:17:57.559 A:middle L:90%
together by the federal government for the general good

366
00:17:57.559 --> 00:18:00.470 A:middle L:90%
, but more specifically for the good of commerce to

367
00:18:00.470 --> 00:18:03.829 A:middle L:90%
be able to make these weather forecasts, particularly for

368
00:18:03.829 --> 00:18:07.599 A:middle L:90%
ships at sea and in the Great Lakes that there

369
00:18:07.599 --> 00:18:10.470 A:middle L:90%
was so much to be gained by saying, I'm

370
00:18:10.470 --> 00:18:11.559 A:middle L:90%
no sailor, but I know that sometimes, depending on

371
00:18:11.559 --> 00:18:12.930 A:middle L:90%
the storm, you either want to get into port or

372
00:18:12.930 --> 00:18:15.359 A:middle L:90%
out of port. And I don't really know which

373
00:18:15.359 --> 00:18:18.400 A:middle L:90%
weather circumstances warrant which. But if you can know

374
00:18:18.400 --> 00:18:21.450 A:middle L:90%
that in advance, you can prevent huge amounts of

375
00:18:21.450 --> 00:18:25.170 A:middle L:90%
losses. And so electronic data just wasn't possible until

376
00:18:25.480 --> 00:18:27.049 A:middle L:90%
telegraphy. So that's part of why it came around

377
00:18:27.049 --> 00:18:30.019 A:middle L:90%
at that point. Uh, but they did name

378
00:18:30.019 --> 00:18:33.660 A:middle L:90%
it the Division of Telegrams and Reports for the benefit

379
00:18:33.670 --> 00:18:36.400 A:middle L:90%
of Commerce. That was the name of this department

380
00:18:36.400 --> 00:18:37.440 A:middle L:90%
, which was rather inelegant. And it was run

381
00:18:37.440 --> 00:18:40.779 A:middle L:90%
by the military not because it was seen as a

382
00:18:40.779 --> 00:18:42.190 A:middle L:90%
military function, but because the federal government,

383
00:18:42.200 --> 00:18:45.329 A:middle L:90%
in 1870, was such that there was no agency

384
00:18:45.329 --> 00:18:49.390 A:middle L:90%
or department that had its act together adequately to monitor and

385
00:18:49.390 --> 00:18:55.170 A:middle L:90%
relay this information with the proper structure and authority in

386
00:18:55.180 --> 00:18:56.890 A:middle L:90%
such a regulated fashion that could be relied on except

387
00:18:56.890 --> 00:18:59.599 A:middle L:90%
for the Army, which is why they had run

388
00:18:59.599 --> 00:19:00.640 A:middle L:90%
it like that. Now, if we include analog

389
00:19:00.650 --> 00:19:03.490 A:middle L:90%
data, then the census would surely be our first

390
00:19:03.500 --> 00:19:07.750 A:middle L:90%
open data thing in the United States as prescribed in

391
00:19:07.750 --> 00:19:11.029 A:middle L:90%
the Constitution. But that, as with our photo

392
00:19:11.029 --> 00:19:11.750 A:middle L:90%
of a spreadsheet, the census is sort of the

393
00:19:11.750 --> 00:19:15.740 A:middle L:90%
equivalent of a photo of numbers as well. Until

394
00:19:15.740 --> 00:19:18.750 A:middle L:90%
those numbers could be provided as data, they couldn't

395
00:19:18.759 --> 00:19:21.220 A:middle L:90%
be shared or moved or changed, and so on

396
00:19:21.220 --> 00:19:23.670 A:middle L:90%
, in the same way. So this is actually

397
00:19:23.670 --> 00:19:26.480 A:middle L:90%
the description that established this early weather agency,

398
00:19:26.779 --> 00:19:29.869 A:middle L:90%
and you'll note that they refer specifically to the

399
00:19:29.869 --> 00:19:32.049 A:middle L:90%
magnetic telegraph. That was the Internet of that day. It

400
00:19:32.049 --> 00:19:33.700 A:middle L:90%
was the Victorian Internet. There's a book by

401
00:19:33.700 --> 00:19:36.420 A:middle L:90%
that name, the Victorian Internet, which I recommend

402
00:19:36.420 --> 00:19:38.059 A:middle L:90%
highly, uh, everything, all the things we

403
00:19:38.059 --> 00:19:41.440 A:middle L:90%
wonder about now and the crazy things people do over

404
00:19:41.440 --> 00:19:42.170 A:middle L:90%
the Internet were all a thing then.

405
00:19:42.539 --> 00:19:45.150 A:middle L:90%
You know, online dating was a thing over telegraphy

406
00:19:45.150 --> 00:19:48.109 A:middle L:90%
in the 1870s. Emoji, we done did

407
00:19:48.109 --> 00:19:49.759 A:middle L:90%
that like all of these, you know, the

408
00:19:49.769 --> 00:19:53.750 A:middle L:90%
shorthand and the goofy acronyms. This all happened in

409
00:19:53.750 --> 00:19:56.799 A:middle L:90%
our country almost 120 years ago. That was a

410
00:19:56.809 --> 00:20:02.349 A:middle L:90%
major cultural shift. It started then, uh so

411
00:20:02.839 --> 00:20:04.799 A:middle L:90%
, uh, let's step forward to the 1970s

412
00:20:04.859 --> 00:20:07.789 A:middle L:90%
. 100 years later, that is when the National

413
00:20:07.789 --> 00:20:08.799 A:middle L:90%
Weather Service, as it was then named, established

414
00:20:08.799 --> 00:20:11.960 A:middle L:90%
their first computer bulletin board system, or BBS. Just

415
00:20:11.960 --> 00:20:14.420 A:middle L:90%
quick show of hands. Who here has ever heard

416
00:20:14.420 --> 00:20:17.460 A:middle L:90%
of a BBS? Not bad. Who here has

417
00:20:17.460 --> 00:20:21.269 A:middle L:90%
ever used a BBS? Fewer. Okay, but some

418
00:20:21.279 --> 00:20:22.220 A:middle L:90%
. So it's okay for the rest of you all

419
00:20:22.259 --> 00:20:26.089 A:middle L:90%
before we had open access to the Internet before there

420
00:20:26.089 --> 00:20:30.390 A:middle L:90%
was a World Wide Web. So the same concept

421
00:20:30.390 --> 00:20:32.769 A:middle L:90%
, except instead of typing a website address,

422
00:20:32.769 --> 00:20:33.799 A:middle L:90%
you had to first discover it existed. So you

423
00:20:33.799 --> 00:20:37.460 A:middle L:90%
get a magazine like Boardwatch, which would mail

424
00:20:37.460 --> 00:20:38.289 A:middle L:90%
you like, Imagine getting a magazine every month with

425
00:20:38.289 --> 00:20:41.019 A:middle L:90%
like, here are the new websites that exist.

426
00:20:41.019 --> 00:20:41.430 A:middle L:90%
This is all of them. We're going to review

427
00:20:41.430 --> 00:20:42.630 A:middle L:90%
them for you, tell you what's good and what's

428
00:20:42.630 --> 00:20:45.309 A:middle L:90%
bad. So you'd get a magazine

429
00:20:45.309 --> 00:20:47.819 A:middle L:90%
like that. It would be a list of phone

430
00:20:47.819 --> 00:20:49.579 A:middle L:90%
numbers for every website, basically. And so you'd get a

431
00:20:49.579 --> 00:20:52.970 A:middle L:90%
phone number and you plug the phone into your computer

432
00:20:52.970 --> 00:20:53.940 A:middle L:90%
into the modem and you dial out to that long

433
00:20:53.940 --> 00:20:56.220 A:middle L:90%
distance number and your computer would make a series of

434
00:20:56.220 --> 00:20:59.670 A:middle L:90%
squawking sounds. And then you would occupy a phone

435
00:20:59.670 --> 00:21:00.740 A:middle L:90%
line. One phone line for that website, such

436
00:21:00.740 --> 00:21:03.640 A:middle L:90%
as it is, and your phone line would be

437
00:21:03.640 --> 00:21:06.069 A:middle L:90%
in use and over your text interface. You could

438
00:21:06.069 --> 00:21:08.240 A:middle L:90%
download whatever information you wanted. But as

439
00:21:08.240 --> 00:21:11.250 A:middle L:90%
goofy as that sounds, it was a wonderful system

440
00:21:11.259 --> 00:21:12.789 A:middle L:90%
, and this is fundamentally the same concept under which

441
00:21:12.789 --> 00:21:15.119 A:middle L:90%
the Internet came to exist. With the Internet, instead

442
00:21:15.119 --> 00:21:17.900 A:middle L:90%
, you would dial up just once, and then

443
00:21:17.900 --> 00:21:18.740 A:middle L:90%
you could connect to anything else on the Internet, using

444
00:21:18.740 --> 00:21:22.200 A:middle L:90%
the Internet instead of phone lines, basically. Well,

445
00:21:22.200 --> 00:21:23.440 A:middle L:90%
the National Weather Service created a BBS

446
00:21:23.440 --> 00:21:26.119 A:middle L:90%
in the 1970s, where you could

447
00:21:26.119 --> 00:21:27.890 A:middle L:90%
dial up to their system in D. C.

448
00:21:27.890 --> 00:21:30.799 A:middle L:90%
You know, 202 area code, and you could

449
00:21:30.799 --> 00:21:33.460 A:middle L:90%
download weather data. You can get real time observations

450
00:21:33.460 --> 00:21:36.329 A:middle L:90%
. You get historical data as these big text files

451
00:21:36.339 --> 00:21:37.779 A:middle L:90%
where you would be able to enter, I don't know

452
00:21:37.779 --> 00:21:38.400 A:middle L:90%
if they used an airport code or what at that

453
00:21:38.400 --> 00:21:41.259 A:middle L:90%
point, But you can get a big file to

454
00:21:41.259 --> 00:21:42.099 A:middle L:90%
be like, yep, here are the observations right now

455
00:21:42.099 --> 00:21:44.279 A:middle L:90%
. Or, you know, as recently as they

456
00:21:44.289 --> 00:21:48.180 A:middle L:90%
had them there. And that started just

457
00:21:48.180 --> 00:21:51.940 A:middle L:90%
a total change in how we use weather data in

458
00:21:51.940 --> 00:21:53.690 A:middle L:90%
this country. So there are no, like private

459
00:21:53.690 --> 00:21:56.400 A:middle L:90%
weather satellites. I remember a congressman. Oh,

460
00:21:56.410 --> 00:21:59.920 A:middle L:90%
it must have been 2010 or 2011, when all the

461
00:21:59.930 --> 00:22:02.200 A:middle L:90%
Tea Party members got into the House, a congressman

462
00:22:02.200 --> 00:22:03.279 A:middle L:90%
had a great idea. What are we using the

463
00:22:03.279 --> 00:22:06.849 A:middle L:90%
National Weather Service for? We got all these private

464
00:22:06.849 --> 00:22:10.079 A:middle L:90%
weather things. Shut it down. That went poorly

465
00:22:10.089 --> 00:22:11.859 A:middle L:90%
. Uh, that's where all of our weather data

466
00:22:11.859 --> 00:22:12.420 A:middle L:90%
comes from. It all comes from national. There

467
00:22:12.420 --> 00:22:15.180 A:middle L:90%
are no private weather satellites circling the earth. All

468
00:22:15.180 --> 00:22:18.210 A:middle L:90%
of our weather data comes from National Weather Service.

469
00:22:18.210 --> 00:22:18.690 A:middle L:90%
So, you know, the Weather Channel and so

470
00:22:18.690 --> 00:22:21.319 A:middle L:90%
on went to this guy, like, uh,

471
00:22:21.329 --> 00:22:22.509 A:middle L:90%
you might not know this, but you're going to put us

472
00:22:22.509 --> 00:22:23.930 A:middle L:90%
all out of business. It'll be the end of all

473
00:22:23.930 --> 00:22:26.970 A:middle L:90%
weather data. You don't seem to understand how the

474
00:22:26.980 --> 00:22:29.259 A:middle L:90%
private sector works. Please stop. That was the

475
00:22:29.259 --> 00:22:30.029 A:middle L:90%
end of that. It was like a day and

476
00:22:30.029 --> 00:22:33.799 A:middle L:90%
a half long movement in the House that ended very

477
00:22:33.799 --> 00:22:37.640 A:middle L:90%
poorly. But this is the system

478
00:22:37.640 --> 00:22:38.470 A:middle L:90%
they set up in the 1970s. Now it's basically

479
00:22:38.470 --> 00:22:41.529 A:middle L:90%
the same thing. It's just on the Web and

480
00:22:41.529 --> 00:22:44.069 A:middle L:90%
the data that is, that is found there that

481
00:22:44.069 --> 00:22:45.880 A:middle L:90%
you get when you, if you were to go

482
00:22:45.880 --> 00:22:48.400 A:middle L:90%
and who goes to the National Weather Service website to

483
00:22:48.400 --> 00:22:49.210 A:middle L:90%
get their forecast like that's not really a thing.

484
00:22:49.220 --> 00:22:52.740 A:middle L:90%
You can; like, the data is there. That same

485
00:22:52.740 --> 00:22:55.569 A:middle L:90%
data, that's what made possible the Weather

486
00:22:55.569 --> 00:22:56.519 A:middle L:90%
Channel. The fact that they could dial up.

487
00:22:56.519 --> 00:22:59.690 A:middle L:90%
I assume they probably kept their modem dialed out all

488
00:22:59.690 --> 00:23:00.380 A:middle L:90%
the time. But you know, they could download

489
00:23:00.380 --> 00:23:02.589 A:middle L:90%
for, like, Weather on the 8s from the

490
00:23:02.589 --> 00:23:04.269 A:middle L:90%
National Weather Service. Like, there's no human putting that

491
00:23:04.269 --> 00:23:07.099 A:middle L:90%
together at the National Weather Service. They're downloading that

492
00:23:07.099 --> 00:23:10.150 A:middle L:90%
data from the National Weather Service, updating it every

493
00:23:10.150 --> 00:23:12.000 A:middle L:90%
few minutes and automatically generating that message that you see

494
00:23:12.000 --> 00:23:15.119 A:middle L:90%
locally on your television. That's how they did it

495
00:23:15.119 --> 00:23:17.079 A:middle L:90%
on the Weather Channel 20 years ago, and that's

496
00:23:17.079 --> 00:23:18.740 A:middle L:90%
how they do it now. So now we have

497
00:23:18.740 --> 00:23:22.259 A:middle L:90%
dozens of websites, every TV and radio station that

498
00:23:22.259 --> 00:23:25.490 A:middle L:90%
has weather constantly. Like, they're not employing meteorologists.

499
00:23:25.500 --> 00:23:26.299 A:middle L:90%
Some are. But by and large, they're getting

500
00:23:26.299 --> 00:23:29.640 A:middle L:90%
all their data straight from the National Weather Service.

501
00:23:29.640 --> 00:23:32.829 A:middle L:90%
I have no fewer than seven weather apps on my

502
00:23:32.829 --> 00:23:33.299 A:middle L:90%
phone. I don't know what's wrong with me.

503
00:23:33.309 --> 00:23:36.410 A:middle L:90%
It's really more of a confession here. Uh,

504
00:23:36.420 --> 00:23:38.730 A:middle L:90%
these are made possible by National Weather Service data.

505
00:23:38.730 --> 00:23:41.740 A:middle L:90%
They all pull their data ultimately directly from the same

506
00:23:41.740 --> 00:23:44.789 A:middle L:90%
website that you and I can go look at.

507
00:23:44.799 --> 00:23:45.970 A:middle L:90%
One of them, my favorite, is called Dark Sky.

508
00:23:47.549 --> 00:23:49.200 A:middle L:90%
Dark Sky is great. Its only job is to keep

509
00:23:49.200 --> 00:23:51.849 A:middle L:90%
track of where you are, and it tells you

510
00:23:51.849 --> 00:23:53.430 A:middle L:90%
it's gonna start raining soon. That's it. It

511
00:23:53.430 --> 00:23:55.880 A:middle L:90%
says, 20 minutes, gonna start raining. Oh,

512
00:23:55.880 --> 00:23:56.839 A:middle L:90%
great. I can go inside or go outside like

513
00:23:56.839 --> 00:23:59.690 A:middle L:90%
, depending on, like my gardening like, what

514
00:23:59.690 --> 00:24:03.250 A:middle L:90%
am I doing now? It's a great system that

515
00:24:03.250 --> 00:24:06.339 A:middle L:90%
would not be possible if we didn't have this data

516
00:24:06.349 --> 00:24:08.670 A:middle L:90%
system. Uh, the entire electric power industry relies

517
00:24:08.670 --> 00:24:11.000 A:middle L:90%
on this incredibly heavily. You need to know how

518
00:24:11.000 --> 00:24:14.109 A:middle L:90%
much coal you need to order. And by the

519
00:24:14.109 --> 00:24:15.990 A:middle L:90%
way, that doesn't show up overnight, like you

520
00:24:15.990 --> 00:24:18.160 A:middle L:90%
need to have lots of advance warning. So if

521
00:24:18.160 --> 00:24:18.980 A:middle L:90%
it's going to be really hot in July, you

522
00:24:18.980 --> 00:24:21.900 A:middle L:90%
got to place that order, and those trains are gonna come

523
00:24:21.900 --> 00:24:23.269 A:middle L:90%
a long way. You don't want to have too much coal

524
00:24:23.279 --> 00:24:25.869 A:middle L:90%
because where are you gonna put it? They don't have a

525
00:24:25.869 --> 00:24:26.769 A:middle L:90%
whole lot of space in the hopper like you need

526
00:24:26.769 --> 00:24:30.460 A:middle L:90%
to keep it moving from the trains and burn it

527
00:24:30.940 --> 00:24:32.509 A:middle L:90%
. So if you order too much or too little

528
00:24:32.509 --> 00:24:33.930 A:middle L:90%
, it's a problem. And that is true for

529
00:24:33.930 --> 00:24:37.079 A:middle L:90%
the entire electric power industry, our trucking fleets, our

530
00:24:37.089 --> 00:24:38.329 A:middle L:90%
airlines, all of our just-in-time delivery.

531
00:24:38.339 --> 00:24:41.759 A:middle L:90%
Uh, using a really conservative estimate, Uh,

532
00:24:42.339 --> 00:24:45.609 A:middle L:90%
our weather data is worth about $10 billion a year

533
00:24:45.609 --> 00:24:47.700 A:middle L:90%
to the economy, Probably worth a lot more.

534
00:24:47.700 --> 00:24:49.059 A:middle L:90%
But using a pretty strict definition, that's $10 billion.

535
00:24:49.079 --> 00:24:52.650 A:middle L:90%
The GPS network, by the way: $90 billion, which

536
00:24:52.650 --> 00:24:53.890 A:middle L:90%
, of course, I'm just using to tell me

537
00:24:53.890 --> 00:24:56.549 A:middle L:90%
it's going to rain soon. It's not really the

538
00:24:56.549 --> 00:24:59.410 A:middle L:90%
highest use of a fleet of satellites circling the Earth

539
00:24:59.410 --> 00:25:00.650 A:middle L:90%
at all times to tell me a great restaurant nearby

540
00:25:00.980 --> 00:25:03.960 A:middle L:90%
. But that's amazing data that we've used to open

541
00:25:03.960 --> 00:25:07.170 A:middle L:90%
up so we can use Yelp and Foursquare and Facebook

542
00:25:07.839 --> 00:25:10.049 A:middle L:90%
. So just to make a quick comparison here.

543
00:25:10.049 --> 00:25:11.460 A:middle L:90%
This is what we're seeing. This isn't data.

544
00:25:11.470 --> 00:25:15.230 A:middle L:90%
This is information. The only purpose when you see

545
00:25:15.240 --> 00:25:18.519 A:middle L:90%
the weather printed in a newspaper, all you can do

546
00:25:18.519 --> 00:25:21.759 A:middle L:90%
with that is read it and say Yep. Now

547
00:25:21.759 --> 00:25:25.069 A:middle L:90%
I know the weather. It doesn't flow. It's

548
00:25:25.079 --> 00:25:26.789 A:middle L:90%
only in the form that it's in. So imagine

549
00:25:26.789 --> 00:25:27.279 A:middle L:90%
if this is how the U.S. government released

550
00:25:27.279 --> 00:25:30.950 A:middle L:90%
weather data. They just printed it every day, and it

551
00:25:30.950 --> 00:25:32.349 A:middle L:90%
just, like, came to your front door and you

552
00:25:32.349 --> 00:25:33.609 A:middle L:90%
read it like, yeah, that's the weather.

553
00:25:33.819 --> 00:25:34.230 A:middle L:90%
There would be no apps. There would be no

554
00:25:34.230 --> 00:25:37.069 A:middle L:90%
Dark Sky to tell me that it's about to rain

555
00:25:37.089 --> 00:25:38.049 A:middle L:90%
. Trucking fleets couldn't use this. This would not

556
00:25:38.049 --> 00:25:40.970 A:middle L:90%
be nearly as useful to the power industry and so on

557
00:25:41.240 --> 00:25:44.130 A:middle L:90%
. Few of the economic benefits that do exist

558
00:25:44.130 --> 00:25:45.190 A:middle L:90%
from weather data would exist if we had it as

559
00:25:45.200 --> 00:25:48.789 A:middle L:90%
information instead of data, and certainly if it wasn't

560
00:25:48.799 --> 00:25:52.150 A:middle L:90%
open data. So this is data. This is

561
00:25:52.150 --> 00:25:55.400 A:middle L:90%
METAR: Meteorological Terminal Aviation Routine Weather Report.

562
00:25:55.400 --> 00:25:56.599 A:middle L:90%
Can anybody here read METAR? Similar question:

563
00:25:56.599 --> 00:25:59.470 A:middle L:90%
Is anybody here an amateur pilot? Because it's the same

564
00:25:59.470 --> 00:26:02.730 A:middle L:90%
question. No? Okay. So often

565
00:26:02.730 --> 00:26:03.529 A:middle L:90%
there's one person in the audience who's like, yes

566
00:26:03.529 --> 00:26:06.500 A:middle L:90%
, and they're very eager because nobody ever asked them

567
00:26:06.500 --> 00:26:07.809 A:middle L:90%
if they can read METAR. So this is their

568
00:26:07.809 --> 00:26:11.210 A:middle L:90%
moment to shine. I can barely read it.

569
00:26:11.220 --> 00:26:12.490 A:middle L:90%
Um, so this is a standard that's been used

570
00:26:12.490 --> 00:26:15.069 A:middle L:90%
since 1968 for weather data. Anybody who's a pilot

571
00:26:15.069 --> 00:26:17.549 A:middle L:90%
, they look at this right before they

572
00:26:17.549 --> 00:26:18.150 A:middle L:90%
get on their flight. So this is weather data

573
00:26:18.150 --> 00:26:22.799 A:middle L:90%
that I grabbed yesterday for KBCB, the closest airport

574
00:26:22.809 --> 00:26:26.490 A:middle L:90%
, and this tells us the time, the wind speed

575
00:26:26.490 --> 00:26:29.849 A:middle L:90%
in knots, the direction of the wind, the

576
00:26:30.140 --> 00:26:32.769 A:middle L:90%
cloud ceiling, the temperature, all of that.

577
00:26:33.539 --> 00:26:37.740 A:middle L:90%
So the trick is, uh, in theory,

578
00:26:37.750 --> 00:26:38.940 A:middle L:90%
software can read this, and humans can read it.

579
00:26:38.950 --> 00:26:41.539 A:middle L:90%
In practice, I don't know what this says,

580
00:26:41.549 --> 00:26:42.920 A:middle L:90%
and I suspect none of us do. So we've

581
00:26:42.920 --> 00:26:48.039 A:middle L:90%
started moving towards some better systems to represent data. So

582
00:26:48.039 --> 00:26:48.950 A:middle L:90%
this is a system known as XML. I understand

583
00:26:48.950 --> 00:26:51.490 A:middle L:90%
this is probably too small for most of you to

584
00:26:51.490 --> 00:26:53.079 A:middle L:90%
read on the screen here, but XML stands for

585
00:26:53.079 --> 00:26:56.700 A:middle L:90%
Extensible Markup Language. This is a means of sharing

586
00:26:56.700 --> 00:27:00.470 A:middle L:90%
data that can be read by both humans and by

587
00:27:00.470 --> 00:27:03.680 A:middle L:90%
software. Some argue that XML is useless to either

588
00:27:03.680 --> 00:27:06.220 A:middle L:90%
, because it's really a pain to work with.

589
00:27:06.230 --> 00:27:07.099 A:middle L:90%
But this is how weather data is provided. So

590
00:27:07.099 --> 00:27:10.220 A:middle L:90%
I went right to the National Weather Service's website.

591
00:27:10.569 --> 00:27:12.460 A:middle L:90%
I chose view source in my browser, and this

592
00:27:12.460 --> 00:27:15.750 A:middle L:90%
is exactly what it showed me. Um, so what

593
00:27:15.750 --> 00:27:18.490 A:middle L:90%
it lists out here in, like, pretty plain English is

594
00:27:18.490 --> 00:27:23.579 A:middle L:90%
things like weather: mostly cloudy; temperature: negative 17.6

595
00:27:23.579 --> 00:27:26.849 A:middle L:90%
in Celsius, zero in Fahrenheit. Wow, I got

596
00:27:26.849 --> 00:27:30.950 A:middle L:90%
that right at the right moment. 20.7 mile per

597
00:27:30.950 --> 00:27:34.059 A:middle L:90%
hour wind, gust, wind chill, all that stuff

598
00:27:34.069 --> 00:27:37.480 A:middle L:90%
. So these are the current observations that I was

599
00:27:37.480 --> 00:27:38.650 A:middle L:90%
looking at here. That's all available in a way

600
00:27:38.650 --> 00:27:41.059 A:middle L:90%
that software knows what to do with it. It's

601
00:27:41.059 --> 00:27:44.869 A:middle L:90%
trivial for a programmer to write software that will look

602
00:27:44.869 --> 00:27:45.650 A:middle L:90%
at this source and display it at the top of the

603
00:27:45.650 --> 00:27:48.940 A:middle L:90%
newspaper website. Here is the weather right now in

604
00:27:48.940 --> 00:27:51.220 A:middle L:90%
our town, Or to use that to put the

605
00:27:51.220 --> 00:27:52.839 A:middle L:90%
weather up on the National Weather Service to say,

606
00:27:52.839 --> 00:27:53.990 A:middle L:90%
Here's the weather in your town right now, but

607
00:27:53.990 --> 00:27:56.900 A:middle L:90%
it's also pretty easy for humans to read

608
00:27:56.900 --> 00:27:57.519 A:middle L:90%
it. Not that I recommend getting your weather like

609
00:27:57.519 --> 00:28:00.240 A:middle L:90%
this, but this is how far we've come.

610
00:28:00.240 --> 00:28:02.519 A:middle L:90%
We've moved from METAR to XML.

611
00:28:02.799 --> 00:28:03.789 A:middle L:90%
And this is what powers the app on my phone

612
00:28:03.789 --> 00:28:07.079 A:middle L:90%
. It's grabbing this exact data. If I looked

613
00:28:07.079 --> 00:28:07.819 A:middle L:90%
at the weather right now on my iPhone, it

614
00:28:07.819 --> 00:28:10.450 A:middle L:90%
would detect where I am. It would get this

615
00:28:10.450 --> 00:28:11.730 A:middle L:90%
exact data feed from that very airport because it's the

616
00:28:11.730 --> 00:28:15.109 A:middle L:90%
one closest to us. So the point I'm trying

617
00:28:15.109 --> 00:28:18.950 A:middle L:90%
to make with this deep dive into weather is that

618
00:28:18.440 --> 00:28:21.930 A:middle L:90%
we have so much that is powered by open data

619
00:28:21.930 --> 00:28:22.950 A:middle L:90%
in this country very quietly in a way that most

620
00:28:22.950 --> 00:28:26.420 A:middle L:90%
people don't need to be aware of. In the background

621
00:28:26.420 --> 00:28:27.640 A:middle L:90%
, open data is substantially making the world go round

622
00:28:27.640 --> 00:28:32.380 A:middle L:90%
. So federal election results open data, campaign finance

623
00:28:32.400 --> 00:28:34.569 A:middle L:90%
, open data, crime incidents, transit data, as

624
00:28:34.569 --> 00:28:37.980 A:middle L:90%
y'all heard about earlier, address coordinates, area code locations,

625
00:28:37.990 --> 00:28:40.660 A:middle L:90%
on and on and on. All these systems are

626
00:28:40.660 --> 00:28:42.470 A:middle L:90%
impossible without open data. ZIP codes would be

627
00:28:42.470 --> 00:28:45.349 A:middle L:90%
pretty useless to anybody but the USPS if they weren't

628
00:28:45.349 --> 00:28:49.019 A:middle L:90%
open data. But because it is open

629
00:28:49.029 --> 00:28:52.109 A:middle L:90%
information, then that can be used by UPS and

630
00:28:52.109 --> 00:28:55.440 A:middle L:90%
FedEx and DHL and anybody else, to be able to

631
00:28:55.440 --> 00:28:56.910 A:middle L:90%
use the same structure to be able to move packages

632
00:28:56.910 --> 00:29:00.240 A:middle L:90%
around in the world. Now compare that,

633
00:29:00.240 --> 00:29:00.609 A:middle L:90%
by the way, to the U.K.,

634
00:29:00.619 --> 00:29:04.349 A:middle L:90%
where address databases are considered proprietary to the postal service

635
00:29:04.349 --> 00:29:07.630 A:middle L:90%
and they're not shared. That really prevents a lot

636
00:29:07.630 --> 00:29:10.230 A:middle L:90%
of private innovation. As a result, it's not

637
00:29:10.230 --> 00:29:12.410 A:middle L:90%
a great system. Our shipping and transit system is

638
00:29:12.410 --> 00:29:15.990 A:middle L:90%
totally dependent on open data in so many ways,

639
00:29:15.990 --> 00:29:17.910 A:middle L:90%
the ability to get import and export data to be

640
00:29:17.910 --> 00:29:21.400 A:middle L:90%
able to share the, what's the word

641
00:29:21.400 --> 00:29:23.259 A:middle L:90%
they use? The bill of laden, the bill of

642
00:29:23.259 --> 00:29:26.369 A:middle L:90%
lading. That's it. That's the term of art

643
00:29:26.420 --> 00:29:27.759 A:middle L:90%
in shipping. There's a whole standard for the bill

644
00:29:27.759 --> 00:29:30.089 A:middle L:90%
of lading. So instead of reading these physical things

645
00:29:30.089 --> 00:29:33.210 A:middle L:90%
on the outside of every box, on every crate on a

646
00:29:33.210 --> 00:29:37.670 A:middle L:90%
ship, that's transmitted electronically. Weather data, the real-time

647
00:29:37.670 --> 00:29:41.529 A:middle L:90%
location of ships and tidal data and all of that

648
00:29:41.890 --> 00:29:44.900 A:middle L:90%
is needed for how we move things across the seas

649
00:29:44.900 --> 00:29:45.980 A:middle L:90%
. There's a similar analogue for how we move things across

650
00:29:45.980 --> 00:29:48.849 A:middle L:90%
land. This is crucial. We also use open

651
00:29:48.849 --> 00:29:51.420 A:middle L:90%
data in a bunch of places where you really wouldn't

652
00:29:51.420 --> 00:29:52.579 A:middle L:90%
think you'd find it. So my favorite example of

653
00:29:52.579 --> 00:29:56.329 A:middle L:90%
private open data is the stock market. There is

654
00:29:56.339 --> 00:30:00.849 A:middle L:90%
no higher example of open data being used outside of

655
00:30:00.849 --> 00:30:02.259 A:middle L:90%
government than the stock market. The stock market is

656
00:30:02.259 --> 00:30:04.420 A:middle L:90%
nothing but a huge pile of open data. Here's

657
00:30:04.420 --> 00:30:07.059 A:middle L:90%
what people are paying for this stock right now,

658
00:30:07.640 --> 00:30:10.670 A:middle L:90%
and here's what they paid yesterday and last week and

659
00:30:11.039 --> 00:30:14.289 A:middle L:90%
10 seconds ago, or 10 milliseconds ago, as

660
00:30:14.289 --> 00:30:15.569 A:middle L:90%
the case may be, that's all they do.

661
00:30:15.579 --> 00:30:18.440 A:middle L:90%
That's their job. Uh, it used to be

662
00:30:18.440 --> 00:30:21.099 A:middle L:90%
much more complicated. Now it's, here's the data

663
00:30:21.099 --> 00:30:22.809 A:middle L:90%
, and there's the ability to make trades

664
00:30:22.809 --> 00:30:23.509 A:middle L:90%
on there as well. But fundamentally, that's what

665
00:30:23.509 --> 00:30:26.349 A:middle L:90%
they trade in is that data. Uh,

666
00:30:26.740 --> 00:30:30.190 A:middle L:90%
crucial. If we didn't have... Imagine if NASDAQ just

667
00:30:30.190 --> 00:30:32.190 A:middle L:90%
said, Yeah, we're shutting it down. We're

668
00:30:32.190 --> 00:30:36.740 A:middle L:90%
not providing stock data anymore. The time that NASDAQ

669
00:30:36.740 --> 00:30:38.670 A:middle L:90%
would continue to exist would be measured in hours, because

670
00:30:38.670 --> 00:30:42.109 A:middle L:90%
why, what purpose do they serve anymore? Product

671
00:30:42.109 --> 00:30:45.150 A:middle L:90%
recalls. That's all open data. It's going

672
00:30:45.150 --> 00:30:48.380 A:middle L:90%
to be open data. OSHA fatality reports. Real-time

673
00:30:48.380 --> 00:30:51.180 A:middle L:90%
earthquake data. There's an earthquake anywhere in this world

674
00:30:51.190 --> 00:30:52.269 A:middle L:90%
you're going to see it on, I think it's

675
00:30:52.269 --> 00:30:55.910 A:middle L:90%
earthquake.gov, within about 10 to 30

676
00:30:55.910 --> 00:30:57.680 A:middle L:90%
seconds. So when we had a big earthquake just

677
00:30:57.680 --> 00:31:00.579 A:middle L:90%
a few miles from my house a few years ago

678
00:31:00.809 --> 00:31:03.420 A:middle L:90%
, and we had aftershocks pretty regularly. Routinely,

679
00:31:03.420 --> 00:31:04.940 A:middle L:90%
I'd think, was that an earthquake? Let me

680
00:31:04.940 --> 00:31:07.750 A:middle L:90%
check. Pull out my iPhone. That's how I found

681
00:31:07.750 --> 00:31:08.140 A:middle L:90%
out there was an earthquake. By the time I

682
00:31:08.140 --> 00:31:11.740 A:middle L:90%
get to the USGS website, they know I know

683
00:31:11.740 --> 00:31:12.019 A:middle L:90%
how strong it is, where it was centered,

684
00:31:12.019 --> 00:31:15.599 A:middle L:90%
how deep it is. That's magical. The real

685
00:31:15.599 --> 00:31:18.329 A:middle L:90%
time location of every airplane and cargo ship or even

686
00:31:18.329 --> 00:31:21.390 A:middle L:90%
large ship in the world. That's all available all

687
00:31:21.390 --> 00:31:22.319 A:middle L:90%
the time. It's amazing when I go to the

688
00:31:22.319 --> 00:31:25.380 A:middle L:90%
beach and go to the Outer Banks and I see

689
00:31:25.380 --> 00:31:26.670 A:middle L:90%
ships sailing by on the horizon. I just pull

690
00:31:26.670 --> 00:31:29.410 A:middle L:90%
up one of the cargo tracking ships. So yeah

691
00:31:29.420 --> 00:31:30.099 A:middle L:90%
, that's where it's going from, that's where it's going to

692
00:31:30.099 --> 00:31:32.269 A:middle L:90%
. Here's a picture of it. Here's what it's

693
00:31:32.269 --> 00:31:33.700 A:middle L:90%
carrying. It's really cool. Now, for me, it's

694
00:31:33.700 --> 00:31:36.269 A:middle L:90%
just really cool. If you're in that industry,

695
00:31:36.440 --> 00:31:38.390 A:middle L:90%
it's essential. Uh, one of my favorite weird

696
00:31:38.390 --> 00:31:41.309 A:middle L:90%
examples of open data is about a year ago,

697
00:31:41.319 --> 00:31:44.519 A:middle L:90%
Marvel, the comic company, they created what's known

698
00:31:44.519 --> 00:31:47.000 A:middle L:90%
as an API, an application programming interface. It's like open

699
00:31:47.000 --> 00:31:48.049 A:middle L:90%
data. But instead of downloading a file, you

700
00:31:48.049 --> 00:31:51.900 A:middle L:90%
just ask for information about one thing like, What

701
00:31:51.900 --> 00:31:52.960 A:middle L:90%
is the weather in this place? Here it is

702
00:31:53.339 --> 00:31:56.430 A:middle L:90%
, uh, about Marvel Comics. So you know

703
00:31:56.440 --> 00:32:00.440 A:middle L:90%
, Spider-Man and X-Men and so on, atomized

704
00:32:00.450 --> 00:32:01.799 A:middle L:90%
down to every tiny little element of every issue.

705
00:32:01.809 --> 00:32:05.660 A:middle L:90%
So you can say I would like a list of

706
00:32:05.670 --> 00:32:07.500 A:middle L:90%
every issue of Spider-Man from the 1970s that

707
00:32:07.500 --> 00:32:10.460 A:middle L:90%
Stan Lee is on the credits for writing. Returned instantly.

708
00:32:12.240 --> 00:32:15.380 A:middle L:90%
Or who was the artist? Who's the artist who

709
00:32:15.390 --> 00:32:20.380 A:middle L:90%
drew the most covers between 1985 and 1987? Answer

710
00:32:20.380 --> 00:32:22.990 A:middle L:90%
right away. Now this sounds a little goofy,

711
00:32:22.000 --> 00:32:25.329 A:middle L:90%
but it's not, really. Comic book collecting is huge

712
00:32:25.339 --> 00:32:28.980 A:middle L:90%
. I'd be very surprised if, uh, Newman, or

713
00:32:28.980 --> 00:32:30.650 A:middle L:90%
certainly the library system at Virginia Tech, didn't have a decent

714
00:32:30.650 --> 00:32:34.450 A:middle L:90%
collection of comics at this point. Uh, whatever

715
00:32:34.450 --> 00:32:37.519 A:middle L:90%
concept might have existed decades ago about comics as, like

716
00:32:37.529 --> 00:32:38.380 A:middle L:90%
, you know, Donald Duck, as things for kids

717
00:32:38.440 --> 00:32:42.910 A:middle L:90%
. It's become serious literature. Collectors of these things,

718
00:32:42.920 --> 00:32:45.039 A:middle L:90%
They want to know what's in every issue so they

719
00:32:45.049 --> 00:32:46.779 A:middle L:90%
might use. There's a lot of software you can

720
00:32:46.779 --> 00:32:51.140 A:middle L:90%
get, like for a desktop computer, that can track issues

721
00:32:51.140 --> 00:32:52.289 A:middle L:90%
of, uh, the issues of comics that you

722
00:32:52.289 --> 00:32:55.009 A:middle L:90%
have. If you can interface that, as they do now,

723
00:32:55.009 --> 00:32:58.740 A:middle L:90%
with Marvel's API, well, it's not like, you know

724
00:32:58.740 --> 00:33:00.549 A:middle L:90%
. It used to be you'd just know the title

725
00:33:00.640 --> 00:33:01.849 A:middle L:90%
and maybe the date that it came out. Now

726
00:33:01.859 --> 00:33:06.150 A:middle L:90%
your boring database of your collection of comics is

727
00:33:06.150 --> 00:33:07.809 A:middle L:90%
incredibly rich in the amount that you know. So

728
00:33:07.809 --> 00:33:09.789 A:middle L:90%
if you're wondering like, what was that one issue

729
00:33:09.789 --> 00:33:14.049 A:middle L:90%
where... you don't need to wonder anymore. Answered, solved

730
00:33:14.839 --> 00:33:16.140 A:middle L:90%
. So open data is in use in

731
00:33:16.140 --> 00:33:19.940 A:middle L:90%
all sorts of interesting places. Now, unfortunately,

732
00:33:19.950 --> 00:33:25.339 A:middle L:90%
not enough places. Legislative data? No, often just doesn't exist

733
00:33:25.559 --> 00:33:29.420 A:middle L:90%
. Laws, the laws that govern us? No, usually

734
00:33:29.420 --> 00:33:30.589 A:middle L:90%
just not available as open data. Municipal GIS

735
00:33:30.589 --> 00:33:34.289 A:middle L:90%
data, registered corporations, restaurant inspections, IRS

736
00:33:34.289 --> 00:33:37.400 A:middle L:90%
990s, FCC political ad spending. There are

737
00:33:37.400 --> 00:33:39.650 A:middle L:90%
so many really great examples of incredibly valuable data that

738
00:33:39.650 --> 00:33:44.250 A:middle L:90%
either isn't available or isn't responsibly available, and you'll note that

739
00:33:44.250 --> 00:33:46.059 A:middle L:90%
it's all really crucial stuff. That's not a coincidence

740
00:33:46.339 --> 00:33:50.599 A:middle L:90%
. As a rule, the more valuable data is

741
00:33:50.609 --> 00:33:52.519 A:middle L:90%
, the more politically sensitive it is, and the

742
00:33:52.519 --> 00:33:53.599 A:middle L:90%
more likely it is there will be some reasons drummed

743
00:33:53.599 --> 00:33:55.470 A:middle L:90%
up as to why it shouldn't be released as

744
00:33:55.470 --> 00:33:59.190 A:middle L:90%
open data. Some of those concerns are legitimate.

745
00:33:59.200 --> 00:34:00.920 A:middle L:90%
A lot of it is often concern trolling. They're

746
00:34:00.920 --> 00:34:04.119 A:middle L:90%
not really concerned. It's just somebody doesn't want to

747
00:34:04.119 --> 00:34:06.740 A:middle L:90%
publish it. In fact, I was just sitting

748
00:34:06.740 --> 00:34:08.050 A:middle L:90%
here reading Twitter, waiting for us

749
00:34:08.050 --> 00:34:09.510 A:middle L:90%
to start here. About an hour ago, I

750
00:34:09.510 --> 00:34:14.269 A:middle L:90%
saw that two members of the General Assembly who also

751
00:34:14.280 --> 00:34:16.570 A:middle L:90%
hold state jobs. In fact, I saw Megan

752
00:34:16.570 --> 00:34:17.909 A:middle L:90%
Ryan, in the back there, quoted in the article

753
00:34:17.920 --> 00:34:22.869 A:middle L:90%
, who also hold state jobs, refused to say

754
00:34:22.880 --> 00:34:24.699 A:middle L:90%
how much they're being paid at their state jobs.

755
00:34:25.730 --> 00:34:29.619 A:middle L:90%
These are members of the General Assembly who are also

756
00:34:29.630 --> 00:34:31.070 A:middle L:90%
on the state payroll, who say it's a secret

757
00:34:31.079 --> 00:34:34.489 A:middle L:90%
how much they're paid. Now, I don't know many

758
00:34:34.489 --> 00:34:37.159 A:middle L:90%
people who believe that it should be a secret how

759
00:34:37.159 --> 00:34:39.199 A:middle L:90%
much state employees are paid. If anything should be

760
00:34:39.199 --> 00:34:42.300 A:middle L:90%
public, it seems like it'd be that. Doubly so

761
00:34:42.300 --> 00:34:44.840 A:middle L:90%
if you're also a legislator. Triply so if

762
00:34:44.840 --> 00:34:45.940 A:middle L:90%
you are, in the case of one of these,

763
00:34:45.940 --> 00:34:47.679 A:middle L:90%
a legislator who has argued for ethics reform that we

764
00:34:47.679 --> 00:34:51.789 A:middle L:90%
need to have more of this data available. That's not

765
00:34:51.789 --> 00:34:53.349 A:middle L:90%
available as open data. Why? I think we

766
00:34:53.349 --> 00:34:55.849 A:middle L:90%
see why we have powerful members of the General Assembly

767
00:34:55.849 --> 00:34:59.889 A:middle L:90%
who don't want things known about them personally, and

768
00:34:59.889 --> 00:35:02.389 A:middle L:90%
this is not a great system. So it's one

769
00:35:02.389 --> 00:35:05.510 A:middle L:90%
thing to persuade people in a room like this why

770
00:35:05.510 --> 00:35:07.510 A:middle L:90%
government should open its data and why it's a good

771
00:35:07.510 --> 00:35:09.639 A:middle L:90%
idea. It's quite another to persuade government of the

772
00:35:09.639 --> 00:35:13.289 A:middle L:90%
same. So we hear a lot of failing arguments

773
00:35:13.460 --> 00:35:15.989 A:middle L:90%
. Some of these arguments can be successful in riling

774
00:35:15.989 --> 00:35:17.289 A:middle L:90%
up the crowd. Some of them can be successful

775
00:35:17.289 --> 00:35:21.119 A:middle L:90%
to persuade elected officials of the merits of it,

776
00:35:21.340 --> 00:35:22.860 A:middle L:90%
but not so successful at actually getting data published.

777
00:35:23.039 --> 00:35:25.349 A:middle L:90%
So one of them is we paid for it.

778
00:35:27.239 --> 00:35:30.119 A:middle L:90%
Uh, government has a fundamental obligation to be transparent

779
00:35:30.119 --> 00:35:31.420 A:middle L:90%
. There's another one. It has economic value.

780
00:35:31.429 --> 00:35:35.090 A:middle L:90%
We have a right to it. These are all

781
00:35:35.099 --> 00:35:37.260 A:middle L:90%
sound and ineffective arguments. Uh, and when I'm

782
00:35:37.260 --> 00:35:39.630 A:middle L:90%
speaking to audiences that include a lot of people in government

783
00:35:39.760 --> 00:35:43.550 A:middle L:90%
, there are a lot of nodding heads, because they know

784
00:35:43.550 --> 00:35:45.900 A:middle L:90%
that somebody walking into their office and saying, I

785
00:35:45.900 --> 00:35:47.670 A:middle L:90%
demand that you publish data because I'm an American and

786
00:35:47.670 --> 00:35:51.119 A:middle L:90%
the Constitution says you have to do this because it's

787
00:35:51.130 --> 00:35:52.460 A:middle L:90%
information, it's public, and FOIA, and give it to me

788
00:35:53.139 --> 00:35:55.260 A:middle L:90%
like you're gonna find every reason you can not to

789
00:35:55.260 --> 00:35:58.039 A:middle L:90%
give them that data. Like, that's just rude, and

790
00:35:58.039 --> 00:35:59.489 A:middle L:90%
people don't want to be treated like that. And

791
00:35:59.489 --> 00:36:00.849 A:middle L:90%
that's often what many of these arguments boil down to.

792
00:36:01.429 --> 00:36:05.489 A:middle L:90%
What is really healthier is regarding open data as being

793
00:36:05.489 --> 00:36:07.329 A:middle L:90%
useful for governments. The government should produce data to

794
00:36:07.329 --> 00:36:12.650 A:middle L:90%
share between agencies between levels of government because that is

795
00:36:12.650 --> 00:36:14.699 A:middle L:90%
a better way for government to work and more on

796
00:36:14.699 --> 00:36:15.900 A:middle L:90%
this later. But the gist of it is that

797
00:36:15.909 --> 00:36:19.949 A:middle L:90%
if open data can make the jobs and, like

798
00:36:19.949 --> 00:36:22.289 A:middle L:90%
personal lives of people in government, easier, then

799
00:36:22.289 --> 00:36:24.159 A:middle L:90%
of course it's going to happen. Why wouldn't it

800
00:36:24.159 --> 00:36:27.320 A:middle L:90%
happen? But if it makes their jobs harder, to

801
00:36:27.320 --> 00:36:29.329 A:middle L:90%
the extent to which it makes them likely to be

802
00:36:29.329 --> 00:36:31.679 A:middle L:90%
hauled before a Senate subcommittee to testify as to how

803
00:36:31.679 --> 00:36:35.219 A:middle L:90%
inaccurate data got published, then they're probably not going

804
00:36:35.219 --> 00:36:37.130 A:middle L:90%
to do it. Uh, but as a happy

805
00:36:37.130 --> 00:36:38.510 A:middle L:90%
byproduct of this idea, that government should share data

806
00:36:38.510 --> 00:36:40.349 A:middle L:90%
within government. The rest of us get this data

807
00:36:40.349 --> 00:36:44.869 A:middle L:90%
too. So the first story here: legislative video, because I

808
00:36:44.869 --> 00:36:46.789 A:middle L:90%
love this story and because a reporter is actually working

809
00:36:46.789 --> 00:36:49.650 A:middle L:90%
on a story about it for Monday, I think,

810
00:36:49.650 --> 00:36:51.139 A:middle L:90%
for The Virginian-Pilot. So I had to tell

811
00:36:51.139 --> 00:36:52.159 A:middle L:90%
him this story about it a few days ago,

812
00:36:52.230 --> 00:36:54.400 A:middle L:90%
this is how I pried video out of the hands of the

813
00:36:54.400 --> 00:36:58.449 A:middle L:90%
Virginia General Assembly to make them open up their data

814
00:36:58.460 --> 00:37:00.449 A:middle L:90%
. So the Legislature streams video of its floor sessions

815
00:37:00.449 --> 00:37:01.750 A:middle L:90%
, not committee meetings. You're out of luck there

816
00:37:01.750 --> 00:37:05.650 A:middle L:90%
if you are not at their predawn meetings in

817
00:37:05.650 --> 00:37:07.570 A:middle L:90%
Richmond, and that's a long drive from here.

818
00:37:07.579 --> 00:37:09.070 A:middle L:90%
If you don't know. And also Virginia goes a

819
00:37:09.070 --> 00:37:12.139 A:middle L:90%
lot farther southwest, so it's a multi-day trip

820
00:37:12.139 --> 00:37:13.619 A:middle L:90%
for a lot of people. If you're not in

821
00:37:13.619 --> 00:37:16.130 A:middle L:90%
the literally predawn meetings in these rooms in the Capitol

822
00:37:16.130 --> 00:37:19.090 A:middle L:90%
, then you don't get to see them. But

823
00:37:19.360 --> 00:37:21.539 A:middle L:90%
it turns out that their floor sessions, they stream

824
00:37:21.539 --> 00:37:25.150 A:middle L:90%
them live on the Web and they're saved to DVD

825
00:37:25.630 --> 00:37:28.840 A:middle L:90%
. They capture, they burn them live to DVDs

826
00:37:28.840 --> 00:37:29.969 A:middle L:90%
. So if it goes long, they have to

827
00:37:29.969 --> 00:37:30.480 A:middle L:90%
, like, switch out DVDs and you miss a

828
00:37:30.480 --> 00:37:32.289 A:middle L:90%
few seconds. Um, it's not a great system

829
00:37:32.289 --> 00:37:34.869 A:middle L:90%
, but that's how they do it. You can't

830
00:37:34.869 --> 00:37:37.519 A:middle L:90%
get archived video anywhere unless you go to the General

831
00:37:37.519 --> 00:37:40.230 A:middle L:90%
Assembly and you pay them for the DVDs. So

832
00:37:40.230 --> 00:37:45.179 A:middle L:90%
I learned this in, oh, 2008, January

833
00:37:45.179 --> 00:37:46.909 A:middle L:90%
2008. I thought this might be fun. So

834
00:37:46.909 --> 00:37:49.239 A:middle L:90%
I went to the General Assembly, said, Hey

835
00:37:49.250 --> 00:37:51.349 A:middle L:90%
, I'd like the DVD for yesterday's session, to, I

836
00:37:51.349 --> 00:37:52.820 A:middle L:90%
forget if it was the House or Senate clerk. So

837
00:37:52.820 --> 00:37:53.929 A:middle L:90%
I was like, no, House clerk's office. Sure

838
00:37:54.019 --> 00:37:58.489 A:middle L:90%
, that'll be $10. Okay, $10, and we waited

839
00:37:58.500 --> 00:38:00.050 A:middle L:90%
. I think a few hours later they had the disc

840
00:38:00.050 --> 00:38:00.489 A:middle L:90%
ready for me and I took it home and I

841
00:38:00.489 --> 00:38:01.750 A:middle L:90%
ripped it, and I put it up on YouTube

842
00:38:02.420 --> 00:38:05.099 A:middle L:90%
and I did it again the next day, and

843
00:38:05.099 --> 00:38:07.019 A:middle L:90%
the third day I was told I was not allowed

844
00:38:07.019 --> 00:38:08.949 A:middle L:90%
to get the video anymore, and I wasn't quite

845
00:38:08.949 --> 00:38:10.630 A:middle L:90%
sure what the story was. But I talked to

846
00:38:10.630 --> 00:38:12.849 A:middle L:90%
a friend of mine at the Virginia ACLU.

847
00:38:12.849 --> 00:38:14.369 A:middle L:90%
They kicked me up to the national ACLU

848
00:38:14.369 --> 00:38:15.460 A:middle L:90%
, who sent a threatening letter to, I gather, the

849
00:38:15.460 --> 00:38:17.530 A:middle L:90%
majority leader of the house, and then I could

850
00:38:17.530 --> 00:38:21.429 A:middle L:90%
get the video again. Uh, and I've been

851
00:38:21.429 --> 00:38:24.460 A:middle L:90%
doing it ever since. I get these stacks of

852
00:38:24.460 --> 00:38:27.699 A:middle L:90%
DVDs in the mail. It costs me $800 per

853
00:38:27.699 --> 00:38:30.300 A:middle L:90%
session. I buy all of the video one by

854
00:38:30.300 --> 00:38:34.090 A:middle L:90%
one, I load the DVDs into my iMac,

855
00:38:34.099 --> 00:38:36.179 A:middle L:90%
I rip the video, and I upload it and make

856
00:38:36.179 --> 00:38:37.059 A:middle L:90%
it available for free so that nobody else has to

857
00:38:37.059 --> 00:38:38.989 A:middle L:90%
buy the video. And the crazy thing is,

858
00:38:38.989 --> 00:38:42.269 A:middle L:90%
I started doing this in 2008. I'm

859
00:38:42.269 --> 00:38:44.260 A:middle L:90%
still doing it. I thought I'd do it for

860
00:38:44.260 --> 00:38:45.210 A:middle L:90%
a year and they'd be like, oh, this

861
00:38:45.210 --> 00:38:46.139 A:middle L:90%
is... let's just start doing it. If this dude

862
00:38:46.139 --> 00:38:49.360 A:middle L:90%
can do it in his spare time, we can

863
00:38:49.360 --> 00:38:52.130 A:middle L:90%
probably manage it. As the oldest legislative body in

864
00:38:52.130 --> 00:38:52.809 A:middle L:90%
the hemisphere, I think we can probably swing this

865
00:38:53.119 --> 00:38:55.300 A:middle L:90%
. I was wrong. I vastly underestimated it.

866
00:38:55.309 --> 00:38:58.630 A:middle L:90%
So I upload all this video to the Internet Archive

867
00:38:58.639 --> 00:39:00.110 A:middle L:90%
at archive.org, to be hosted permanently and for

868
00:39:00.110 --> 00:39:01.409 A:middle L:90%
free, which is great for me. There is

869
00:39:01.420 --> 00:39:04.889 A:middle L:90%
other than buying the freaking video, which is crazy

870
00:39:04.900 --> 00:39:06.760 A:middle L:90%
. Uh, there's no cost. I've tried, by

871
00:39:06.760 --> 00:39:07.090 A:middle L:90%
the way, to, like, say, can you just

872
00:39:07.090 --> 00:39:08.719 A:middle L:90%
give it to me on, like, a memory stick?

873
00:39:08.719 --> 00:39:12.000 A:middle L:90%
I don't need to rip these. Or, like, upload

874
00:39:12.000 --> 00:39:14.079 A:middle L:90%
it so I could download it or like that would

875
00:39:14.079 --> 00:39:15.570 A:middle L:90%
be great. But no, they only have the

876
00:39:15.570 --> 00:39:17.250 A:middle L:90%
DVDs, and they will not use any other system

877
00:39:17.260 --> 00:39:21.019 A:middle L:90%
other than copying DVDs and I have these, my

878
00:39:21.030 --> 00:39:22.949 A:middle L:90%
, like cleaning out my attic a few weeks ago

879
00:39:22.619 --> 00:39:24.820 A:middle L:90%
. Like I just have, like, boxes of

880
00:39:24.820 --> 00:39:27.969 A:middle L:90%
DVDs, like, full of DVDs. I had to

881
00:39:27.969 --> 00:39:29.599 A:middle L:90%
throw them away. Like, I guess I told

882
00:39:29.599 --> 00:39:30.159 A:middle L:90%
him to the State Library. I don't know.

883
00:39:30.170 --> 00:39:34.139 A:middle L:90%
This is just This is a crazy system. So

884
00:39:34.139 --> 00:39:36.840 A:middle L:90%
here's a short clip of what that video looks like

885
00:39:36.849 --> 00:39:40.730 A:middle L:90%
Seven of 473. A bill to repeal and

886
00:39:40.730 --> 00:39:45.389 A:middle L:90%
reenact various sections of the Virginia Uniform Foreign Country Money

887
00:39:45.400 --> 00:39:49.829 A:middle L:90%
Judgments Recognition Act. Exciting stuff. With an amendment

888
00:39:50.400 --> 00:39:52.110 A:middle L:90%
. The gentleman from Salem. Mr. Speaker, I

889
00:39:52.110 --> 00:39:54.360 A:middle L:90%
move the committee amendment. Question is on adoption of the committee amendment. As many as favor

890
00:39:54.590 --> 00:39:57.389 A:middle L:90%
that motion will say aye. Note the graphic in the

891
00:39:57.389 --> 00:40:00.309 A:middle L:90%
top right of the screen there. The gentleman. Thank

892
00:40:00.309 --> 00:40:00.940 A:middle L:90%
you, Mr Speaker. Members of the Senate And

893
00:40:00.940 --> 00:40:05.360 A:middle L:90%
now one of the three replaces the 1990 Uniform Foreign

894
00:40:05.360 --> 00:40:08.420 A:middle L:90%
Country Money Judgments Recognition Act and modernizes the process

895
00:40:08.420 --> 00:40:10.389 A:middle L:90%
for... His mouth is out of sync with the audio.

896
00:40:10.599 --> 00:40:14.440 A:middle L:90%
uniform. Law Council will be about the 20th ST

897
00:40:14.639 --> 00:40:16.110 A:middle L:90%
without the bill. You get the idea. So

898
00:40:16.119 --> 00:40:20.639 A:middle L:90%
, um, there are three bits of metadata on

899
00:40:20.639 --> 00:40:22.699 A:middle L:90%
the screen right now that I use. So the

900
00:40:22.699 --> 00:40:24.190 A:middle L:90%
first is each of the chyrons. That's the industry

901
00:40:24.190 --> 00:40:27.489 A:middle L:90%
term that's used for the graphics. So one in

902
00:40:27.489 --> 00:40:30.179 A:middle L:90%
the corner identifies the bill that's being discussed right now

903
00:40:30.190 --> 00:40:31.610 A:middle L:90%
, and one identifies the person who's speaking. And

904
00:40:31.610 --> 00:40:35.659 A:middle L:90%
the third piece of metadata is Delegate Habeeb's face.

905
00:40:35.670 --> 00:40:37.739 A:middle L:90%
There's a fourth we can't see. That's the audio,

906
00:40:37.809 --> 00:40:40.210 A:middle L:90%
his voice pitch and his actual words. So I

907
00:40:40.210 --> 00:40:44.329 A:middle L:90%
use all five of those to index every second of video

908
00:40:44.710 --> 00:40:45.579 A:middle L:90%
. So once a second I look for these

909
00:40:45.579 --> 00:40:47.230 A:middle L:90%
chyrons in the video, and

910
00:40:47.230 --> 00:40:50.940 A:middle L:90%
if I find them, I, uh, grab

911
00:40:50.940 --> 00:40:52.210 A:middle L:90%
that graphic, I transform it into black and white, I

912
00:40:52.210 --> 00:40:54.809 A:middle L:90%
invert the colors and stretch the contrast. And then

913
00:40:54.809 --> 00:40:59.190 A:middle L:90%
I run it through OCR software to turn it into

914
00:40:59.190 --> 00:41:00.860 A:middle L:90%
text. And then I load it into the database

915
00:41:00.869 --> 00:41:02.929 A:middle L:90%
with a time stamp of the video, the day

916
00:41:02.940 --> 00:41:06.329 A:middle L:90%
, the session and the second in which that happened

917
00:41:06.510 --> 00:41:07.420 A:middle L:90%
. And I do some transformations in the database to

918
00:41:07.420 --> 00:41:10.099 A:middle L:90%
say, okay, Delegate Gregory Habeeb has this

919
00:41:10.099 --> 00:41:13.469 A:middle L:90%
ID, is this legislator, he's in this district.

920
00:41:13.480 --> 00:41:15.019 A:middle L:90%
Make sure that the name is accurate, that the

921
00:41:15.019 --> 00:41:16.329 A:middle L:90%
bill exists and so on. Um, and as

922
00:41:16.329 --> 00:41:19.489 A:middle L:90%
a result, I wind up with this is a

923
00:41:19.489 --> 00:41:22.050 A:middle L:90%
site I run about the General Assembly. Uh so

924
00:41:22.059 --> 00:41:22.809 A:middle L:90%
So here. Here's where the bill winds up.

925
00:41:22.820 --> 00:41:24.219 A:middle L:90%
The data winds up being used. So this bill

926
00:41:24.219 --> 00:41:27.380 A:middle L:90%
says that Quantico is allowed to set its own speed

927
00:41:27.380 --> 00:41:29.980 A:middle L:90%
limits. So imagine if you wanted to learn about

928
00:41:29.980 --> 00:41:32.030 A:middle L:90%
why Delegate Dudenhefer introduced this bill. Well, because

929
00:41:32.030 --> 00:41:34.880 A:middle L:90%
we know the times in which the bill was discussed

930
00:41:34.880 --> 00:41:37.460 A:middle L:90%
, we can dynamically create an excerpt so you could

931
00:41:37.460 --> 00:41:39.510 A:middle L:90%
just watch the video and listen to the author of

932
00:41:39.510 --> 00:41:43.780 A:middle L:90%
the bill explain it. Uh, why the video

933
00:41:43.780 --> 00:41:45.820 A:middle L:90%
is not playing for some reason. This can

934
00:41:45.820 --> 00:41:47.179 A:middle L:90%
be very dramatic. Anyway, uh, so,

935
00:41:47.190 --> 00:41:51.719 A:middle L:90%
uh, you would hear there Delegate Dudenhefer explaining why

936
00:41:51.730 --> 00:41:52.420 A:middle L:90%
it's necessary for Quantico to have that exemption

937
00:41:52.420 --> 00:41:54.769 A:middle L:90%
. I don't recall it being very persuasive. But

938
00:41:54.769 --> 00:41:57.639 A:middle L:90%
the value of this is that otherwise, when you

939
00:41:57.639 --> 00:41:59.099 A:middle L:90%
look at a bill, you have no idea why

940
00:41:59.099 --> 00:42:00.880 A:middle L:90%
it passed. But so to be able to look

941
00:42:00.880 --> 00:42:02.320 A:middle L:90%
at the video clips when it was discussed on the

942
00:42:02.320 --> 00:42:05.989 A:middle L:90%
floor of the General Assembly to see a discussion of

943
00:42:05.989 --> 00:42:07.690 A:middle L:90%
exactly why the bill is being proposed and what it accomplishes,

944
00:42:07.820 --> 00:42:10.949 A:middle L:90%
it eliminates any mystery about legislative intent. You just

945
00:42:10.949 --> 00:42:14.159 A:middle L:90%
know and to be able to connect that with the

946
00:42:14.159 --> 00:42:16.349 A:middle L:90%
actual bills is really important. Uh, we

947
00:42:16.349 --> 00:42:19.440 A:middle L:90%
did that. So in addition to using that data

948
00:42:19.440 --> 00:42:20.800 A:middle L:90%
on my site which, like, I don't really

949
00:42:20.800 --> 00:42:22.940 A:middle L:90%
much care about, um it's more that I also

950
00:42:22.940 --> 00:42:23.619 A:middle L:90%
give the data away. So this is the download

951
00:42:23.619 --> 00:42:25.789 A:middle L:90%
section of Richmond Sunlight, which is very glamorous looking

952
00:42:25.800 --> 00:42:29.019 A:middle L:90%
and that video index file at the bottom there,

953
00:42:29.030 --> 00:42:30.340 A:middle L:90%
if we open it up. This is a file

954
00:42:30.340 --> 00:42:34.699 A:middle L:90%
format called JSON. JavaScript Object Notation is kind of

955
00:42:34.699 --> 00:42:37.190 A:middle L:90%
like XML. It's a way to represent data in

956
00:42:37.190 --> 00:42:39.730 A:middle L:90%
a really raw way. So every time somebody has spoken

957
00:42:39.730 --> 00:42:43.030 A:middle L:90%
on the floor of the Legislature, we record here

958
00:42:43.230 --> 00:42:44.719 A:middle L:90%
the chamber, the name of the speaker, the

959
00:42:44.719 --> 00:42:45.880 A:middle L:90%
bill they were addressing, when they started speaking, when they

960
00:42:45.880 --> 00:42:47.769 A:middle L:90%
stopped speaking, and then a link to the video file.

961
00:42:49.179 --> 00:42:51.170 A:middle L:90%
Anybody can grab the data and do anything they want

962
00:42:51.170 --> 00:42:52.360 A:middle L:90%
with it, which is bound to be much more

963
00:42:52.360 --> 00:42:53.460 A:middle L:90%
interesting than anything that I've done with it. That seems

964
00:42:53.460 --> 00:42:57.280 A:middle L:90%
ripe for some pretty great analysis. I still have

965
00:42:57.280 --> 00:42:59.260 A:middle L:90%
not persuaded the General Assembly that they should do

966
00:42:59.260 --> 00:43:00.519 A:middle L:90%
this. They have all sorts of really lame excuses

967
00:43:00.519 --> 00:43:02.550 A:middle L:90%
as to why they haven't published this stuff. I

968
00:43:02.550 --> 00:43:05.739 A:middle L:90%
don't know what their deal is. Second story,

969
00:43:05.750 --> 00:43:07.469 A:middle L:90%
business records, Uh, the record of every corporation

970
00:43:07.469 --> 00:43:10.889 A:middle L:90%
registered in Virginia. There are about 800,000 corporations in Virginia

971
00:43:10.889 --> 00:43:13.929 A:middle L:90%
that exist now or have existed in the past.

972
00:43:13.929 --> 00:43:15.940 A:middle L:90%
I think the records go back maybe 15 years.

973
00:43:15.230 --> 00:43:17.559 A:middle L:90%
These records are maintained by the State Corporation Commission,

974
00:43:17.559 --> 00:43:20.539 A:middle L:90%
and they're wildly valuable, as you can imagine for

975
00:43:20.539 --> 00:43:22.170 A:middle L:90%
all kinds of purposes. Um, just any business

976
00:43:22.170 --> 00:43:24.130 A:middle L:90%
in which you need leads. There is a strong

977
00:43:24.130 --> 00:43:27.389 A:middle L:90%
example. Uh, but no doubt we can all

978
00:43:27.389 --> 00:43:29.010 A:middle L:90%
think of many examples of why it might be useful

979
00:43:29.010 --> 00:43:30.599 A:middle L:90%
to know what businesses actually exist in your town.

980
00:43:30.760 --> 00:43:34.099 A:middle L:90%
In your county, in your state. Who owns

981
00:43:34.099 --> 00:43:36.059 A:middle L:90%
them? Who is on the board?

982
00:43:36.070 --> 00:43:38.280 A:middle L:90%
How many shares people own. Name changes. It's all

983
00:43:38.280 --> 00:43:40.530 A:middle L:90%
really useful stuff. So the only way to get

984
00:43:40.530 --> 00:43:43.650 A:middle L:90%
this data right now is from the State Corporation Commission's

985
00:43:43.650 --> 00:43:45.559 A:middle L:90%
website. One record at a time. There are

986
00:43:45.559 --> 00:43:46.360 A:middle L:90%
800,000, and you have to do it manually on

987
00:43:46.360 --> 00:43:50.280 A:middle L:90%
a website, and the available information they provide is

988
00:43:50.280 --> 00:43:52.539 A:middle L:90%
incomplete. They actually have more data than they bother

989
00:43:52.539 --> 00:43:54.050 A:middle L:90%
to display on the website. But like the General

990
00:43:54.050 --> 00:43:55.739 A:middle L:90%
Assembly, they will sell it to you. It's

991
00:43:55.739 --> 00:43:59.730 A:middle L:90%
$150 a month subscription fee, which I've been paying

992
00:44:00.090 --> 00:44:02.679 A:middle L:90%
since April, I think. And I'm grateful to

993
00:44:02.679 --> 00:44:06.510 A:middle L:90%
my wife for her tolerance of this budget line item

994
00:44:06.519 --> 00:44:07.840 A:middle L:90%
in our household. There are seven paying customers

995
00:44:07.840 --> 00:44:09.309 A:middle L:90%
. I FOIA'd the list. I'm number seven.

996
00:44:09.789 --> 00:44:12.849 A:middle L:90%
Uh, I have a hard time believing that it

997
00:44:12.849 --> 00:44:15.199 A:middle L:90%
costs them less to administer this program than they make

998
00:44:15.210 --> 00:44:17.019 A:middle L:90%
in income. Uh, this can't be like a

999
00:44:17.019 --> 00:44:21.329 A:middle L:90%
very sensible use of their time, billing me for $400

1000
00:44:21.329 --> 00:44:22.659 A:middle L:90%
every three months. Uh, so, uh,

1001
00:44:22.670 --> 00:44:25.429 A:middle L:90%
luckily, a great organization called the Shuttleworth Foundation sent me

1002
00:44:25.429 --> 00:44:29.050 A:middle L:90%
out of the blue $5,000 this summer to defray my

1003
00:44:29.050 --> 00:44:30.619 A:middle L:90%
cost of buying government data that should be free and

1004
00:44:30.619 --> 00:44:32.489 A:middle L:90%
giving it away to actually make it free. Uh

1005
00:44:32.500 --> 00:44:35.369 A:middle L:90%
, which was nice of them. So I had

1006
00:44:35.369 --> 00:44:37.789 A:middle L:90%
to sign this five page contract. Remember the rules

1007
00:44:37.789 --> 00:44:40.000 A:middle L:90%
of openness. I had to sign a contract to

1008
00:44:40.000 --> 00:44:43.679 A:middle L:90%
receive public data. That's like a big red flag

1009
00:44:43.690 --> 00:44:45.630 A:middle L:90%
, I think, for the state right there. So

1010
00:44:45.639 --> 00:44:46.420 A:middle L:90%
So this, you know, fails to be open

1011
00:44:46.420 --> 00:44:49.539 A:middle L:90%
data. Uh, but the contract I read over

1012
00:44:49.539 --> 00:44:52.269 A:middle L:90%
carefully and I consulted with the Electronic Frontier Foundation and

1013
00:44:52.269 --> 00:44:53.920 A:middle L:90%
the American Civil Liberties Union. Uh, can I

1014
00:44:53.920 --> 00:44:55.579 A:middle L:90%
give this data away, or are they gonna sue me?

1015
00:44:55.579 --> 00:44:58.230 A:middle L:90%
And they said, I think you're okay. We

1016
00:44:58.230 --> 00:44:59.659 A:middle L:90%
think you can actually give this data away. The

1017
00:44:59.659 --> 00:45:01.099 A:middle L:90%
contract doesn't prohibit it. So I signed the contract

1018
00:45:01.099 --> 00:45:02.820 A:middle L:90%
sent in the check, and as soon as they gave

1019
00:45:02.820 --> 00:45:06.050 A:middle L:90%
me access within 10 minutes, I'd written a script

1020
00:45:06.130 --> 00:45:07.780 A:middle L:90%
that would download the file, which they update

1021
00:45:07.780 --> 00:45:09.389 A:middle L:90%
at 2 a.m. every Wednesday morning, and upload it to a

1022
00:45:09.389 --> 00:45:12.809 A:middle L:90%
website where anybody can download it for free so they

1023
00:45:12.809 --> 00:45:14.489 A:middle L:90%
will never get any more customers. They are capped

1024
00:45:14.489 --> 00:45:15.579 A:middle L:90%
at seven. Because you'd be crazy to pay for

1025
00:45:15.579 --> 00:45:19.099 A:middle L:90%
something that I'm giving away for free. Uh,

1026
00:45:19.110 --> 00:45:21.210 A:middle L:90%
and for me to download that data and give it

1027
00:45:21.210 --> 00:45:23.280 A:middle L:90%
away costs me upwards of several cents each month for

1028
00:45:23.280 --> 00:45:25.800 A:middle L:90%
the data transfer. Uh, so no big cost

1029
00:45:25.800 --> 00:45:28.760 A:middle L:90%
for me either. This is what the data looks

1030
00:45:28.760 --> 00:45:30.420 A:middle L:90%
like. It is really ugly. These are just

1031
00:45:30.420 --> 00:45:31.469 A:middle L:90%
five random business records. Believe it or not,

1032
00:45:31.469 --> 00:45:34.659 A:middle L:90%
there is a huge amount of data available in here

1033
00:45:34.670 --> 00:45:36.380 A:middle L:90%
. Uh, good luck finding it, but it

1034
00:45:36.380 --> 00:45:37.289 A:middle L:90%
is in there, and it was my unfortunate job

1035
00:45:37.300 --> 00:45:40.349 A:middle L:90%
to get it. Imagine going spelunking without a flashlight

1036
00:45:40.570 --> 00:45:44.219 A:middle L:90%
. That's what it was like for months, trying to make

1037
00:45:44.219 --> 00:45:45.179 A:middle L:90%
my way through this file and extract useful data out

1038
00:45:45.179 --> 00:45:46.960 A:middle L:90%
of it. But I used that data to start

1039
00:45:46.960 --> 00:45:50.699 A:middle L:90%
an open source project, which turns the data into

1040
00:45:50.710 --> 00:45:52.900 A:middle L:90%
a spreadsheet and also into the aforementioned JSON, and

1041
00:45:52.900 --> 00:45:55.699 A:middle L:90%
then loads it into a database named Elasticsearch that makes

1042
00:45:55.699 --> 00:45:58.710 A:middle L:90%
it searchable. The heart of it all is this

1043
00:45:58.710 --> 00:46:00.710 A:middle L:90%
little command line program. I named it Crump, for

1044
00:46:00.710 --> 00:46:02.409 A:middle L:90%
Beverly T. Crump, the very first SCC commissioner

1045
00:46:02.880 --> 00:46:06.110 A:middle L:90%
, and note that despite being a woman's name because

1046
00:46:06.110 --> 00:46:07.559 A:middle L:90%
it is somebody important in government, it's a dude

1047
00:46:07.570 --> 00:46:12.449 A:middle L:90%
, Beverly. That's Virginia's government. It was

1048
00:46:12.449 --> 00:46:15.010 A:middle L:90%
also 1900, in defense of Beverly. But what this

1049
00:46:15.010 --> 00:46:16.920 A:middle L:90%
program does is iterate through the file, which

1050
00:46:16.920 --> 00:46:19.940 A:middle L:90%
is, I think, millions of lines long and

1051
00:46:19.940 --> 00:46:22.619 A:middle L:90%
breaks it up into tiny little pieces, fixes formatting

1052
00:46:22.619 --> 00:46:23.949 A:middle L:90%
errors and text encoding errors, uh, and loads

1053
00:46:23.949 --> 00:46:27.590 A:middle L:90%
it all into that database to be indexed, and

1054
00:46:27.590 --> 00:46:28.900 A:middle L:90%
it pops up a website that looks like this.

1055
00:46:28.900 --> 00:46:30.610 A:middle L:90%
This is all free open source software. Anybody who

1056
00:46:30.610 --> 00:46:31.400 A:middle L:90%
wanted to compete with me (I don't know why,

1057
00:46:31.400 --> 00:46:32.719 A:middle L:90%
I don't make any money off this, there's no

1058
00:46:32.719 --> 00:46:36.599 A:middle L:90%
advertising) could download the software I've written and run it

1059
00:46:36.599 --> 00:46:37.519 A:middle L:90%
and set up their own website. So this is

1060
00:46:37.530 --> 00:46:40.130 A:middle L:90%
vabusinesses.org. Anybody can download this bulk

1061
00:46:40.130 --> 00:46:42.519 A:middle L:90%
data that I get, but I put it in better

1062
00:46:42.519 --> 00:46:45.809 A:middle L:90%
formats. There's an interface to review each record and

1063
00:46:45.809 --> 00:46:46.480 A:middle L:90%
browse through it. But with all the data,

1064
00:46:46.480 --> 00:46:49.869 A:middle L:90%
unlike the SCC's website, and I think that other

1065
00:46:49.869 --> 00:46:51.690 A:middle L:90%
people will do much more clever things with this than

1066
00:46:51.690 --> 00:46:52.500 A:middle L:90%
this ugly little website that I have slapped together.

1067
00:46:52.980 --> 00:46:57.579 A:middle L:90%
Also courtesy of Code for America, the Hampton Roads

1068
00:46:57.579 --> 00:47:00.340 A:middle L:90%
chapter, a developer there without me having to do

1069
00:47:00.340 --> 00:47:04.099 A:middle L:90%
any work whatsoever, put an API, an application

1070
00:47:04.099 --> 00:47:06.099 A:middle L:90%
programming interface on it for the site, which is

1071
00:47:06.099 --> 00:47:07.679 A:middle L:90%
wonderful, because what it means is that somebody can

1072
00:47:07.679 --> 00:47:09.059 A:middle L:90%
enter a URL like the one here, a

1073
00:47:09.059 --> 00:47:12.079 A:middle L:90%
website address. So this is saying, Give me

1074
00:47:12.079 --> 00:47:14.789 A:middle L:90%
any businesses named Gillie's in the city of Blacksburg,

1075
00:47:15.380 --> 00:47:16.570 A:middle L:90%
and it returns exactly one result, and it informs

1076
00:47:16.570 --> 00:47:19.349 A:middle L:90%
us that Gillie's, a restaurant that I was fond

1077
00:47:19.349 --> 00:47:20.630 A:middle L:90%
of when I lived around the corner from here in

1078
00:47:20.630 --> 00:47:22.059 A:middle L:90%
Blacksburg, on College Ave. Jan [inaudible] is

1079
00:47:22.059 --> 00:47:24.539 A:middle L:90%
the president, vice president, treasurer, secretary and registered

1080
00:47:24.539 --> 00:47:27.500 A:middle L:90%
agent. This is clearly a very capable woman.

1081
00:47:27.980 --> 00:47:30.239 A:middle L:90%
She incorporated in 1983. There are 20,000 shares she

1082
00:47:30.239 --> 00:47:32.110 A:middle L:90%
owns all of them. And we have this exact

1083
00:47:32.110 --> 00:47:37.840 A:middle L:90%
same data available for 800,000 other businesses. A funny

1084
00:47:37.840 --> 00:47:39.210 A:middle L:90%
outcome of this that I never anticipated in any way

1085
00:47:39.679 --> 00:47:43.769 A:middle L:90%
. I got a phone call from an employee of

1086
00:47:43.769 --> 00:47:45.599 A:middle L:90%
a town here in Virginia back in October, who

1087
00:47:45.599 --> 00:47:47.920 A:middle L:90%
said, Hey, I stumbled across this website of

1088
00:47:47.920 --> 00:47:51.550 A:middle L:90%
yours. Could we get the records for all businesses

1089
00:47:51.550 --> 00:47:52.989 A:middle L:90%
in our town? And I said, I suppose

1090
00:47:52.989 --> 00:47:54.670 A:middle L:90%
I could geo code all the data and break it

1091
00:47:54.670 --> 00:47:57.489 A:middle L:90%
up by town. Why do you need it?

1092
00:47:57.880 --> 00:47:59.750 A:middle L:90%
She said, well, you know, we have

1093
00:47:59.750 --> 00:48:00.980 A:middle L:90%
a business license and a business license tax here,

1094
00:48:01.289 --> 00:48:04.579 A:middle L:90%
but we only know if somebody starts a business here

1095
00:48:04.579 --> 00:48:06.719 A:middle L:90%
if they voluntarily fill out the form and tell

1096
00:48:06.719 --> 00:48:07.440 A:middle L:90%
us they need to pay taxes. We have no

1097
00:48:07.440 --> 00:48:09.320 A:middle L:90%
way of auditing it. We don't know what businesses

1098
00:48:09.320 --> 00:48:12.159 A:middle L:90%
are in our town. I said,

1099
00:48:12.159 --> 00:48:13.980 A:middle L:90%
doesn't the State Corporation Commission tell you when there's

1100
00:48:13.980 --> 00:48:15.269 A:middle L:90%
a new business in your town? She said, no,

1101
00:48:15.269 --> 00:48:16.389 A:middle L:90%
no, no. We have no way of knowing

1102
00:48:16.400 --> 00:48:17.829 A:middle L:90%
I said, but can't you ask them for the data?

1103
00:48:17.829 --> 00:48:20.639 A:middle L:90%
They said, Well, no. And I explained

1104
00:48:20.639 --> 00:48:22.000 A:middle L:90%
to her they could pay for the data, but

1105
00:48:22.000 --> 00:48:22.730 A:middle L:90%
she said, We're not programmers. We wouldn't know

1106
00:48:22.730 --> 00:48:24.630 A:middle L:90%
what to do with this big file that is so

1107
00:48:24.630 --> 00:48:28.050 A:middle L:90%
ugly. So really, this seems like a terrible

1108
00:48:28.050 --> 00:48:30.440 A:middle L:90%
system. So I taught myself how to over the

1109
00:48:30.440 --> 00:48:32.110 A:middle L:90%
next few weeks, geo code all the data

1110
00:48:32.110 --> 00:48:35.309 A:middle L:90%
, as we say, to assign latitude and longitudes

1111
00:48:35.309 --> 00:48:37.300 A:middle L:90%
to every business and got the boundary data for every

1112
00:48:37.300 --> 00:48:39.699 A:middle L:90%
town and county and city in the state and then

1113
00:48:39.699 --> 00:48:42.780 A:middle L:90%
sent her the records for that town. I then heard

1114
00:48:42.780 --> 00:48:45.659 A:middle L:90%
from the Commissioner of Revenue from my city, who wanted

1115
00:48:45.659 --> 00:48:46.619 A:middle L:90%
the same data, and there wound up being a couple

1116
00:48:46.619 --> 00:48:49.210 A:middle L:90%
of stories in the press about it a couple weeks ago

1117
00:48:49.210 --> 00:48:51.059 A:middle L:90%
. And I've heard from a bunch of counties that

1118
00:48:51.059 --> 00:48:52.329 A:middle L:90%
are saying, Oh my God, this would be

1119
00:48:52.329 --> 00:48:53.460 A:middle L:90%
wonderful. We have no way of knowing who's not

1120
00:48:53.460 --> 00:48:57.110 A:middle L:90%
paying. We did an audit of Charlottesville's records and

1121
00:48:57.110 --> 00:49:00.909 A:middle L:90%
found that they're missing somewhere. Well, they're missing

1122
00:49:00.909 --> 00:49:04.349 A:middle L:90%
1900 businesses, so I think about a third of

1123
00:49:04.349 --> 00:49:06.090 A:middle L:90%
their businesses they don't know are in the city.

1124
00:49:06.469 --> 00:49:07.809 A:middle L:90%
Now, not all businesses have to pay license fees

1125
00:49:07.849 --> 00:49:09.809 A:middle L:90%
, and it's not clear exactly what percentage do.

1126
00:49:09.820 --> 00:49:14.440 A:middle L:90%
But the average amount that businesses pay in license tax to

1127
00:49:14.440 --> 00:49:15.739 A:middle L:90%
Charlottesville every year is about $1,500 and change, so they're

1128
00:49:15.739 --> 00:49:20.460 A:middle L:90%
missing out on about $2 million in revenue, if all

1129
00:49:20.460 --> 00:49:22.590 A:middle L:90%
those businesses are supposed to pay license taxes, if

1130
00:49:22.590 --> 00:49:24.469 A:middle L:90%
we pretend that only 80% of them are exempted for

1131
00:49:24.469 --> 00:49:27.190 A:middle L:90%
some reason, that strikes me as real unlikely.

1132
00:49:27.199 --> 00:49:30.510 A:middle L:90%
There's still $600,000 a year they're missing out on. Now,

1133
00:49:30.510 --> 00:49:31.659 A:middle L:90%
just for scale, the city's budget is $150 million a

1134
00:49:31.659 --> 00:49:34.550 A:middle L:90%
year, and they get to go back and recover

1135
00:49:34.550 --> 00:49:37.039 A:middle L:90%
five missing years. So even if it's a million

1136
00:49:37.039 --> 00:49:37.590 A:middle L:90%
, they get to multiply that by five. There's a

1137
00:49:37.590 --> 00:49:40.219 A:middle L:90%
$4.5 million budget shortfall in Charlottesville right now. They

1138
00:49:40.219 --> 00:49:44.780 A:middle L:90%
can fix that just with this audit, which

1139
00:49:44.780 --> 00:49:45.449 A:middle L:90%
they're doing right now. They're using these records to

1140
00:49:45.449 --> 00:49:47.059 A:middle L:90%
audit. They need to hire a new auditor because

1141
00:49:47.059 --> 00:49:51.389 A:middle L:90%
they've never tried to audit 1900 records, but they're

1142
00:49:51.389 --> 00:49:53.150 A:middle L:90%
doing their best. Uh, looking ahead,

1143
00:49:53.150 --> 00:49:55.030 A:middle L:90%
I want the State Corporation Commission to do this.

1144
00:49:55.039 --> 00:49:58.369 A:middle L:90%
It is crazy that I am doing this, that

1145
00:49:58.369 --> 00:50:00.710 A:middle L:90%
we're going to have almost 200 localities when you include

1146
00:50:00.719 --> 00:50:05.460 A:middle L:90%
towns failing to be able to audit their data unless

1147
00:50:05.460 --> 00:50:07.309 A:middle L:90%
I show up and do something in my spare time

1148
00:50:07.309 --> 00:50:08.739 A:middle L:90%
for fun. This is a really lousy system.

1149
00:50:08.750 --> 00:50:12.210 A:middle L:90%
So there's, uh, the state CIO

1150
00:50:12.210 --> 00:50:14.989 A:middle L:90%
is becoming the chief administrative officer of the State

1151
00:50:14.989 --> 00:50:16.559 A:middle L:90%
Corporation Commission. Uh, he knows about my work

1152
00:50:16.559 --> 00:50:19.159 A:middle L:90%
. I'm optimistic that he'll show up and say,

1153
00:50:19.159 --> 00:50:21.739 A:middle L:90%
Guys, what are you doing? Why are we

1154
00:50:21.739 --> 00:50:24.440 A:middle L:90%
not helping our own localities? And by the way

1155
00:50:24.449 --> 00:50:27.260 A:middle L:90%
, I have been told that this data is not

1156
00:50:27.260 --> 00:50:29.550 A:middle L:90%
being used anywhere else either, including the state Department of

1157
00:50:29.550 --> 00:50:31.170 A:middle L:90%
Taxation. If they wanted to know what businesses are

1158
00:50:31.170 --> 00:50:34.460 A:middle L:90%
not paying taxes, they are not able to

1159
00:50:34.460 --> 00:50:36.429 A:middle L:90%
use this data to audit their own records. This

1160
00:50:36.429 --> 00:50:37.659 A:middle L:90%
is a terrible system. So when I say open

1161
00:50:37.659 --> 00:50:40.130 A:middle L:90%
data by government for government, this is the example

1162
00:50:40.559 --> 00:50:45.130 A:middle L:90%
. We're looking at $200 million across the state.

1163
00:50:45.559 --> 00:50:47.159 A:middle L:90%
I mean, however conservative or optimistic you want

1164
00:50:47.159 --> 00:50:49.690 A:middle L:90%
to be in your estimates. If you're feeling more

1165
00:50:49.690 --> 00:50:52.010 A:middle L:90%
conservative, say $100 million in missing taxes to localities

1166
00:50:52.019 --> 00:50:55.300 A:middle L:90%
, because the state isn't sharing their data. This is

1167
00:50:55.300 --> 00:51:00.250 A:middle L:90%
crazy. So we wrap up with the state of

1168
00:51:00.250 --> 00:51:01.929 A:middle L:90%
the Commonwealth, a quick rundown of where Virginia stands

1169
00:51:01.929 --> 00:51:04.480 A:middle L:90%
in its provision of some key data sets. Now

1170
00:51:04.480 --> 00:51:06.449 A:middle L:90%
there's nothing magic about these data sets that I have listed

1171
00:51:06.449 --> 00:51:07.809 A:middle L:90%
here. I happen to regard them as important for

1172
00:51:07.809 --> 00:51:09.960 A:middle L:90%
states to provide. They are in no particular

1173
00:51:09.960 --> 00:51:13.400 A:middle L:90%
order, but let's let's go through them. Legislation

1174
00:51:13.400 --> 00:51:15.039 A:middle L:90%
and legislators. The Legislature provides spreadsheets of bills and

1175
00:51:15.039 --> 00:51:16.929 A:middle L:90%
votes. That's great. There's no budget data.

1176
00:51:16.929 --> 00:51:19.989 A:middle L:90%
There's no legislator data. There's no committee data.

1177
00:51:20.159 --> 00:51:22.579 A:middle L:90%
Uh, and getting the data is unadvertised. It

1178
00:51:22.579 --> 00:51:24.719 A:middle L:90%
requires registration. It's not updated often enough. I

1179
00:51:24.730 --> 00:51:30.079 A:middle L:90%
give them a C. Geo data. Virginia does two

1180
00:51:30.079 --> 00:51:31.630 A:middle L:90%
amazing things with geographic data. The first is they're

1181
00:51:31.630 --> 00:51:34.789 A:middle L:90%
one of, I think two, maybe three states

1182
00:51:34.789 --> 00:51:37.449 A:middle L:90%
in the country to publish, as of December, what

1183
00:51:37.449 --> 00:51:39.269 A:middle L:90%
is known as an address points database: every address

1184
00:51:39.269 --> 00:51:42.769 A:middle L:90%
, every assigned address in the state. They have

1185
00:51:42.769 --> 00:51:44.800 A:middle L:90%
a latitude and longitude for, and they'll give you

1186
00:51:44.800 --> 00:51:45.309 A:middle L:90%
that list. It's available. You can download it

1187
00:51:45.309 --> 00:51:49.090 A:middle L:90%
freely. That is huge. That is incredibly valuable

1188
00:51:49.289 --> 00:51:51.219 A:middle L:90%
. The cost of geo coding data, which all

1189
00:51:51.219 --> 00:51:53.300 A:middle L:90%
sorts of businesses need to convert addresses into actual locations

1190
00:51:53.519 --> 00:51:55.989 A:middle L:90%
is prohibitive. When I needed to geo code my

1191
00:51:55.989 --> 00:51:59.349 A:middle L:90%
Virginia business data, it was going to cost me $1,200

1192
00:51:59.349 --> 00:52:00.849 A:middle L:90%
a week to use a commercial geo coder. Like,

1193
00:52:00.849 --> 00:52:04.199 A:middle L:90%
Every time I run it, I have to pay

1194
00:52:04.199 --> 00:52:06.519 A:middle L:90%
that fee. Um, even if I was

1195
00:52:06.519 --> 00:52:08.260 A:middle L:90%
really, uh, cautious in my my software development

1196
00:52:08.269 --> 00:52:09.530 A:middle L:90%
, uh, and how I did it, it

1197
00:52:09.530 --> 00:52:13.139 A:middle L:90%
would still be $1,200 up front and hundreds of dollars every

1198
00:52:13.139 --> 00:52:15.449 A:middle L:90%
subsequent week. Now it's free. It doesn't cost

1199
00:52:15.449 --> 00:52:16.949 A:middle L:90%
me a thing. Thanks, Virginia. Virginia does

1200
00:52:16.949 --> 00:52:19.639 A:middle L:90%
another really great thing with its geo data. It

1201
00:52:19.639 --> 00:52:22.590 A:middle L:90%
has an openly available geo coder. So they have

1202
00:52:22.590 --> 00:52:23.530 A:middle L:90%
an API, which I think we all know about

1203
00:52:23.530 --> 00:52:27.170 A:middle L:90%
now, an API where you can give it an address

1204
00:52:27.179 --> 00:52:29.219 A:middle L:90%
and it returns a cleaned up, proper version of

1205
00:52:29.219 --> 00:52:30.969 A:middle L:90%
the address and its latitude and longitude. That's that's

1206
00:52:30.969 --> 00:52:34.469 A:middle L:90%
magic. That's wonderful. That is an incredibly rare thing

1207
00:52:34.469 --> 00:52:36.619 A:middle L:90%
for states to do. All the rest of the geo

1208
00:52:36.619 --> 00:52:37.230 A:middle L:90%
data. And my God, there's a lot.

1209
00:52:37.289 --> 00:52:42.239 A:middle L:90%
I mean, congressional districts and house districts and soil

1210
00:52:42.239 --> 00:52:45.440 A:middle L:90%
types and watersheds, all these things. It's complicated and

1211
00:52:45.440 --> 00:52:47.929 A:middle L:90%
it's spotty. It's poorly documented, but by God

1212
00:52:47.929 --> 00:52:50.880 A:middle L:90%
, I give them so much credit for the address points

1213
00:52:50.880 --> 00:52:52.690 A:middle L:90%
database and the openly available geo coding. I give

1214
00:52:52.690 --> 00:52:54.389 A:middle L:90%
Virginia a B plus on geo data. Judicial records: there

1215
00:52:54.389 --> 00:52:58.519 A:middle L:90%
is none. F. Nothing. There's nothing. It's a

1216
00:52:58.519 --> 00:53:00.710 A:middle L:90%
wasteland. Laws and regulations, best in the nation

1217
00:53:00.940 --> 00:53:05.449 A:middle L:90%
. Nobody is even close to Virginia. There's not

1218
00:53:05.449 --> 00:53:07.280 A:middle L:90%
even a second place. By the way, Virginia

1219
00:53:07.280 --> 00:53:08.429 A:middle L:90%
has, uh, an API and bulk

1220
00:53:08.440 --> 00:53:12.730 A:middle L:90%
data for all laws, all regulations, charters,

1221
00:53:12.730 --> 00:53:15.000 A:middle L:90%
constitution. A and a bunch of pluses, like whatever

1222
00:53:15.000 --> 00:53:17.110 A:middle L:90%
you're supposed to do with an eBay review like that

1223
00:53:17.119 --> 00:53:20.789 A:middle L:90%
, A plus plus plus plus plus. I

1224
00:53:20.800 --> 00:53:22.159 A:middle L:90%
could be proud of Virginia in that respect. Corporate

1225
00:53:22.159 --> 00:53:24.090 A:middle L:90%
data: D, for reasons that are clear to all of

1226
00:53:24.090 --> 00:53:25.829 A:middle L:90%
you now. The data exists. But come on

1227
00:53:25.829 --> 00:53:29.639 A:middle L:90%
, it's awful. Campaign finance data. The State

1228
00:53:29.639 --> 00:53:31.769 A:middle L:90%
Board of Elections provides bulk data for campaign finance reports

1229
00:53:31.780 --> 00:53:34.269 A:middle L:90%
. They just started a couple years ago. They're

1230
00:53:34.269 --> 00:53:37.110 A:middle L:90%
not going back. They're hard to gather, and

1231
00:53:37.110 --> 00:53:38.230 A:middle L:90%
that's all that's provided. There are no election results

1232
00:53:38.269 --> 00:53:40.769 A:middle L:90%
, but that's still a lot better than most states

1233
00:53:40.769 --> 00:53:43.489 A:middle L:90%
. So on a curve, I'll give them a

1234
00:53:43.489 --> 00:53:46.530 A:middle L:90%
B minus. They're doing some innovative stuff. Health

1235
00:53:46.530 --> 00:53:49.909 A:middle L:90%
data. Now there's a lot of potential health data

1236
00:53:49.909 --> 00:53:51.929 A:middle L:90%
out there. So let's just set the bar real low

1237
00:53:51.929 --> 00:53:52.449 A:middle L:90%
because that's where it needs to be for health data

1238
00:53:52.449 --> 00:53:54.309 A:middle L:90%
, because it's in a bad way. Nationally,

1239
00:53:54.320 --> 00:53:58.679 A:middle L:90%
let's just say restaurant inspections, nursing home inspections and

1240
00:53:58.679 --> 00:54:00.619 A:middle L:90%
vaccination data. So there is no restaurant inspection data

1241
00:54:00.619 --> 00:54:04.510 A:middle L:90%
, just the Canadian website that isn't even indexed by

1242
00:54:04.510 --> 00:54:07.639 A:middle L:90%
Google. That's where: healthspace.com,

1243
00:54:07.639 --> 00:54:07.789 A:middle L:90%
healthspace.ca, I think it's the

1244
00:54:07.789 --> 00:54:09.550 A:middle L:90%
site. Uh, that's where we keep our data

1245
00:54:09.550 --> 00:54:12.920 A:middle L:90%
, in, like, an internet ghetto. Uh, so

1246
00:54:12.920 --> 00:54:15.900 A:middle L:90%
that's not data, that's barely information. The vaccine data

1247
00:54:15.900 --> 00:54:17.510 A:middle L:90%
is messy, but it's there. Lousy data for nursing

1248
00:54:17.510 --> 00:54:21.170 A:middle L:90%
home inspections. I give Virginia a D on health data

1249
00:54:21.179 --> 00:54:22.869 A:middle L:90%
. And finally, having a central data repository. So the

1250
00:54:22.869 --> 00:54:25.230 A:middle L:90%
good news is there is a data.virginia.gov

1251
00:54:25.239 --> 00:54:28.099 A:middle L:90%
. The bad news is that it's bad. Uh

1252
00:54:28.110 --> 00:54:29.980 A:middle L:90%
, it's not a proper data repository. It's a

1253
00:54:29.980 --> 00:54:31.650 A:middle L:90%
website dressed up to look like a data repository.

1254
00:54:31.650 --> 00:54:34.849 A:middle L:90%
It's just a pile of HTML, but at least

1255
00:54:34.849 --> 00:54:37.369 A:middle L:90%
it exists. So, C. Uh, and I

1256
00:54:37.369 --> 00:54:40.579 A:middle L:90%
think that's gonna be improving soon. So here's what

1257
00:54:40.579 --> 00:54:44.840 A:middle L:90%
we need. Just examples; I could go on

1258
00:54:44.840 --> 00:54:45.980 A:middle L:90%
and on. But these are some core types of

1259
00:54:45.980 --> 00:54:49.349 A:middle L:90%
data that Virginia should provide but does not, and some

1260
00:54:49.349 --> 00:54:51.610 A:middle L:90%
of this you'll notice is not very exciting. Some

1261
00:54:51.610 --> 00:54:53.360 A:middle L:90%
of it is really boring: a list of all schools

1262
00:54:53.840 --> 00:54:55.860 A:middle L:90%
. Yeah, but how else are you gonna know

1263
00:54:55.860 --> 00:54:58.800 A:middle L:90%
what all the schools are? And that's useful for

1264
00:54:58.800 --> 00:55:00.980 A:middle L:90%
all kinds of things. And these things change: schools

1265
00:55:00.980 --> 00:55:04.239 A:middle L:90%
move, or you build new schools, there are new

1266
00:55:04.239 --> 00:55:06.349 A:middle L:90%
principals, uh, the grades they serve might

1267
00:55:06.349 --> 00:55:07.340 A:middle L:90%
change. Um, you know, So there's a

1268
00:55:07.340 --> 00:55:09.429 A:middle L:90%
lot of types of data like this. All

1269
00:55:09.429 --> 00:55:13.139 A:middle L:90%
elected officials: that is not known. The state does

1270
00:55:13.139 --> 00:55:14.710 A:middle L:90%
not know what all the elected officials are in the

1271
00:55:14.710 --> 00:55:15.269 A:middle L:90%
state. Not only do they not know, they don't

1272
00:55:15.269 --> 00:55:19.809 A:middle L:90%
know how they're going to find out. This seems

1273
00:55:19.809 --> 00:55:22.320 A:middle L:90%
like a problem. So for the new requirements to

1274
00:55:22.320 --> 00:55:23.969 A:middle L:90%
file these ethics reports from elected officials across the state

1275
00:55:24.070 --> 00:55:25.960 A:middle L:90%
, the state doesn't know who's supposed to file because they don't

1276
00:55:25.960 --> 00:55:28.500 A:middle L:90%
know who is elected. They have no way of

1277
00:55:28.500 --> 00:55:31.079 A:middle L:90%
finding out. Data: how does it work? Uh

1278
00:55:31.090 --> 00:55:34.889 A:middle L:90%
, but I'll conclude with some positive trends. Geo

1279
00:55:34.889 --> 00:55:37.989 A:middle L:90%
data is improving fast. Code for Hampton Roads has been

1280
00:55:37.989 --> 00:55:39.159 A:middle L:90%
doing some great work opening up health data with their

1281
00:55:39.159 --> 00:55:43.429 A:middle L:90%
Open Health Inspections app by doing the really awkward work

1282
00:55:43.429 --> 00:55:45.880 A:middle L:90%
of pulling that data, scraping as we call it

1283
00:55:45.889 --> 00:55:49.070 A:middle L:90%
, out of healthspace.ca. Business data

1284
00:55:49.070 --> 00:55:51.449 A:middle L:90%
is improving with new data and a website API

1285
00:55:51.449 --> 00:55:52.860 A:middle L:90%
. The current governor is interested in open data,

1286
00:55:53.099 --> 00:55:55.480 A:middle L:90%
and they co-opted me by getting me to write

1287
00:55:55.480 --> 00:55:58.659 A:middle L:90%
the transition plan for the new governor for open data

1288
00:55:59.039 --> 00:56:00.039 A:middle L:90%
and then helping them execute it. So I hope

1289
00:56:00.039 --> 00:56:01.550 A:middle L:90%
that will add up to some sort of good.

1290
00:56:02.030 --> 00:56:05.099 A:middle L:90%
So I think Virginia is finally on an upswing.

1291
00:56:05.110 --> 00:56:07.659 A:middle L:90%
Uh, for years, it was embarrassing to go to

1292
00:56:07.659 --> 00:56:08.079 A:middle L:90%
conferences about open data, and people would say,

1293
00:56:08.079 --> 00:56:12.489 A:middle L:90%
Hey, Virginia, how are things there? Good

1294
00:56:12.500 --> 00:56:14.369 A:middle L:90%
, Grim. It's real grim, like there was

1295
00:56:14.369 --> 00:56:15.340 A:middle L:90%
never any sign to point to, to say, like, things

1296
00:56:15.340 --> 00:56:19.099 A:middle L:90%
are getting better. But in the last 12 months

1297
00:56:19.110 --> 00:56:21.059 A:middle L:90%
, things have changed hugely, and I think it

1298
00:56:21.059 --> 00:56:22.719 A:middle L:90%
just it's timing. I don't I don't think it

1299
00:56:22.719 --> 00:56:23.889 A:middle L:90%
necessarily. Some of the credit goes to a new

1300
00:56:23.889 --> 00:56:25.750 A:middle L:90%
governor who's looking for new ideas, and that

1301
00:56:25.750 --> 00:56:28.840 A:middle L:90%
happens with the change of administration, and some of

1302
00:56:28.840 --> 00:56:30.920 A:middle L:90%
it comes down to agencies just saying, now we have

1303
00:56:30.920 --> 00:56:34.659 A:middle L:90%
to improve how we do things. So yeah,

1304
00:56:35.030 --> 00:56:37.039 A:middle L:90%
Questions? What have I said that is stupid

1305
00:56:37.039 --> 00:56:38.750 A:middle L:90%
or wrong or confusing or what have I not explained

1306
00:56:38.750 --> 00:56:43.639 A:middle L:90%
properly? I did everything perfectly. Is that yes

1307
00:56:43.639 --> 00:56:47.329 A:middle L:90%
. Megan? Um, no. But particular are

1308
00:56:47.340 --> 00:56:53.960 A:middle L:90%
working with commercial. Right? Veronica's if you to

1309
00:56:53.969 --> 00:56:57.739 A:middle L:90%
do to try to mash up some of the data

1310
00:56:57.739 --> 00:57:00.320 A:middle L:90%
when they get involved, Does that make it a

1311
00:57:00.329 --> 00:57:05.650 A:middle L:90%
prohibitive a situation graduate? That on the back?

1312
00:57:07.429 --> 00:57:09.239 A:middle L:90%
So there's a good news bad news of when governments

1313
00:57:09.239 --> 00:57:14.570 A:middle L:90%
go to companies that specialize in working with government to

1314
00:57:14.570 --> 00:57:16.369 A:middle L:90%
make government data information available. The good news is

1315
00:57:16.369 --> 00:57:20.550 A:middle L:90%
many governments lack the internal expertise to do this themselves

1316
00:57:20.809 --> 00:57:22.360 A:middle L:90%
. And they not only do they not know how

1317
00:57:22.360 --> 00:57:22.670 A:middle L:90%
to do it, they don't even know how to

1318
00:57:22.670 --> 00:57:25.250 A:middle L:90%
hire somebody to do it. And so they're... Granicus

1319
00:57:25.250 --> 00:57:28.849 A:middle L:90%
is the big vendor for video. If you're

1320
00:57:28.849 --> 00:57:30.849 A:middle L:90%
looking at video online, like from the General Assembly

1321
00:57:30.849 --> 00:57:31.650 A:middle L:90%
, or really just about any other government in the country,

1322
00:57:31.650 --> 00:57:35.190 A:middle L:90%
it's coming through Granicus. Um, Granicus has been

1323
00:57:35.190 --> 00:57:37.360 A:middle L:90%
talking about... every year or two, they announce they're

1324
00:57:37.360 --> 00:57:37.860 A:middle L:90%
going to have a big new open data initiative.

1325
00:57:37.860 --> 00:57:40.030 A:middle L:90%
All this video is going to be available, and

1326
00:57:40.030 --> 00:57:43.780 A:middle L:90%
it never actually seems to happen. And so it's

1327
00:57:43.780 --> 00:57:46.699 A:middle L:90%
become sort of a ghetto for open data like it's

1328
00:57:46.699 --> 00:57:51.280 A:middle L:90%
where open data goes to die like people ignore it

1329
00:57:51.280 --> 00:57:52.829 A:middle L:90%
there. It's never going to come out. There's

1330
00:57:52.829 --> 00:57:54.420 A:middle L:90%
no way for it to work its way up into

1331
00:57:54.420 --> 00:57:57.730 A:middle L:90%
proper open data. Uh, Socrata is one of

1332
00:57:57.730 --> 00:58:00.809 A:middle L:90%
the major vendors of what's known as open data repository software, central

1333
00:58:00.809 --> 00:58:02.289 A:middle L:90%
software, where there's one website you can go to,

1334
00:58:02.289 --> 00:58:05.780 A:middle L:90%
like data.virginia.gov, and all the data that's

1335
00:58:05.780 --> 00:58:07.559 A:middle L:90%
available can be registered there and made searchable. So

1336
00:58:07.559 --> 00:58:09.440 A:middle L:90%
, for instance, on the federal level, a website that

1337
00:58:09.440 --> 00:58:14.170 A:middle L:90%
I made some years ago, ethics.gov, where you

1338
00:58:14.170 --> 00:58:15.050 A:middle L:90%
can type in the name of anybody. And it

1339
00:58:15.050 --> 00:58:16.780 A:middle L:90%
will tell you: have they visited the White House?

1340
00:58:16.780 --> 00:58:19.389 A:middle L:90%
If so, how many times? Have they donated?

1341
00:58:19.389 --> 00:58:20.889 A:middle L:90%
If so, to whom? Have they registered as a

1342
00:58:20.889 --> 00:58:22.500 A:middle L:90%
lobbyist? When? Who do they lobby for? It

1343
00:58:22.500 --> 00:58:25.130 A:middle L:90%
searches all these data sets in one place. So

1344
00:58:25.130 --> 00:58:27.809 A:middle L:90%
Socrata is the major vendor in that space.

1345
00:58:27.820 --> 00:58:30.579 A:middle L:90%
Uh, there isn't that problem with Socrata.

1346
00:58:30.579 --> 00:58:31.659 A:middle L:90%
The good news is when you load your datasets into

1347
00:58:31.659 --> 00:58:34.539 A:middle L:90%
that, they can come out again. Uh,

1348
00:58:34.550 --> 00:58:37.429 A:middle L:90%
they remain accessible and, um, fungible and movable

1349
00:58:37.429 --> 00:58:40.030 A:middle L:90%
and so on. Generally, I think it's a

1350
00:58:40.030 --> 00:58:43.389 A:middle L:90%
really positive step. When governments are willing to work

1351
00:58:43.389 --> 00:58:45.880 A:middle L:90%
with vendors like that. We need so many more

1352
00:58:45.070 --> 00:58:47.780 A:middle L:90%
. There's so few vendors. My organization, in

1353
00:58:47.780 --> 00:58:52.480 A:middle L:90%
fact, just awarded a large contract this week to

1354
00:58:52.480 --> 00:58:57.170 A:middle L:90%
create open data repository software that is so easy to

1355
00:58:57.170 --> 00:59:00.429 A:middle L:90%
launch and manage child repositories that we hope it will

1356
00:59:00.429 --> 00:59:01.420 A:middle L:90%
get a bunch of people in the business that people

1357
00:59:01.420 --> 00:59:04.840 A:middle L:90%
say, I guess I could be a government contractor

1358
00:59:04.840 --> 00:59:07.340 A:middle L:90%
and sell government or software data repositories now because it's

1359
00:59:07.340 --> 00:59:08.139 A:middle L:90%
just so easy to do with this software. We're

1360
00:59:08.139 --> 00:59:10.329 A:middle L:90%
giving the software away because we think that we really

1361
00:59:10.329 --> 00:59:14.070 A:middle L:90%
need competition in this space. So it's it's a

1362
00:59:14.070 --> 00:59:15.900 A:middle L:90%
mixed bag, is the answer to your question. But I'm

1363
00:59:15.900 --> 00:59:16.880 A:middle L:90%
glad to see the forward trend of governments trying to

1364
00:59:16.880 --> 00:59:19.530 A:middle L:90%
release data, even if they're not doing it in

1365
00:59:19.530 --> 00:59:21.329 A:middle L:90%
what I think is a great way just yet.

1366
00:59:22.019 --> 00:59:25.340 A:middle L:90%
What else? So on your list of, uh

1367
00:59:25.719 --> 00:59:29.480 A:middle L:90%
, what's needed, what we need, one of these is schools

1368
00:59:29.480 --> 00:59:35.050 A:middle L:90%
, which you mentioned so when you mentioned that we

1369
00:59:35.050 --> 00:59:38.969 A:middle L:90%
need it, are you proposing that someone develop something

1370
00:59:38.980 --> 00:59:42.050 A:middle L:90%
to scrape this and manage such a database? Are

1371
00:59:42.050 --> 00:59:46.289 A:middle L:90%
you suggesting that administratively or legislatively, the state make

1372
00:59:46.289 --> 00:59:50.980 A:middle L:90%
some changes that filter through the school systems that assemble

1373
00:59:50.980 --> 00:59:54.309 A:middle L:90%
this information? Well, I have my own, like,

1374
00:59:54.320 --> 00:59:58.760 A:middle L:90%
John Galtian fantasy of all developers saying, no more

1375
00:59:58.760 --> 01:00:00.590 A:middle L:90%
scrapers. We're not writing any more scrapers. That's

1376
01:00:00.590 --> 01:00:02.409 A:middle L:90%
it. If government isn't releasing data, that's it

1377
01:00:02.409 --> 01:00:05.030 A:middle L:90%
. You can't get it. We're done. Um

1378
01:00:05.030 --> 01:00:06.340 A:middle L:90%
, and that would be, uh, wow.

1379
01:00:06.340 --> 01:00:07.880 A:middle L:90%
It's alarming hearing a klaxon go off while talking because

1380
01:00:07.880 --> 01:00:10.369 A:middle L:90%
it's like either I've said something horribly wrong or we

1381
01:00:10.369 --> 01:00:13.420 A:middle L:90%
need to evacuate the building. It's a good reality

1382
01:00:13.420 --> 01:00:15.280 A:middle L:90%
check, though. I appreciate that, and I'll re-evaluate

1383
01:00:15.280 --> 01:00:16.909 A:middle L:90%
everything I just said. Um, uh, yeah,

1384
01:00:16.920 --> 01:00:19.590 A:middle L:90%
so that's not going to happen. But I

1385
01:00:19.590 --> 01:00:22.030 A:middle L:90%
do like it is a common topic among my fellow

1386
01:00:22.030 --> 01:00:23.510 A:middle L:90%
developers of like we have to stop. It's like

1387
01:00:23.510 --> 01:00:28.340 A:middle L:90%
you're you're enabling somebody's bad behavior, like buying beers

1388
01:00:28.340 --> 01:00:30.110 A:middle L:90%
for an alcoholic like No, we'll just scrape that

1389
01:00:30.110 --> 01:00:31.099 A:middle L:90%
data for you. Um, what we really need

1390
01:00:31.099 --> 01:00:34.219 A:middle L:90%
to do is for government to release this. God

1391
01:00:34.219 --> 01:00:35.829 A:middle L:90%
help us. If the state doesn't have a list

1392
01:00:35.829 --> 01:00:37.360 A:middle L:90%
of all its schools. And that could be. If the

1393
01:00:37.360 --> 01:00:39.039 A:middle L:90%
state doesn't know who all the elected officials are, maybe

1394
01:00:39.039 --> 01:00:42.480 A:middle L:90%
the state doesn't know. But I propose to you that

1395
01:00:42.489 --> 01:00:44.300 A:middle L:90%
either A, they have the data, in which case

1396
01:00:44.300 --> 01:00:45.829 A:middle L:90%
publish it. Why not? What's the harm?

1397
01:00:45.829 --> 01:00:46.639 A:middle L:90%
Make it available to everybody. People will do interesting

1398
01:00:46.639 --> 01:00:49.320 A:middle L:90%
things with it. I bet localities could use it

1399
01:00:49.820 --> 01:00:51.920 A:middle L:90%
or B, they don't have a list of all

1400
01:00:51.920 --> 01:00:53.030 A:middle L:90%
the schools. And that's the real problem. Not

1401
01:00:53.030 --> 01:00:54.940 A:middle L:90%
that the data is not available, but the state

1402
01:00:54.940 --> 01:00:57.960 A:middle L:90%
doesn't know what the schools are. And I don't

1403
01:00:57.960 --> 01:00:59.960 A:middle L:90%
know which outcome is more alarming that they have a

1404
01:00:59.960 --> 01:01:00.550 A:middle L:90%
list of schools and they won't give it to us

1405
01:01:00.559 --> 01:01:02.909 A:middle L:90%
or they just don't know what their schools are.

1406
01:01:02.920 --> 01:01:06.059 A:middle L:90%
So for all of these data sets, I would

1407
01:01:06.059 --> 01:01:07.659 A:middle L:90%
like government to release these. I think it would

1408
01:01:07.659 --> 01:01:09.679 A:middle L:90%
be useful to the private sector. I know it would be

1409
01:01:09.679 --> 01:01:15.769 A:middle L:90%
useful to government. Quick follow-up: so that means

1410
01:01:15.769 --> 01:01:16.789 A:middle L:90%
that for schools, for example, there must be

1411
01:01:16.789 --> 01:01:23.119 A:middle L:90%
private databases that are sold to places like Zillow dot

1412
01:01:23.119 --> 01:01:25.719 A:middle L:90%
com, where you can find what schools. So

1413
01:01:25.719 --> 01:01:29.159 A:middle L:90%
that's what's happening now is that some of this data

1414
01:01:29.170 --> 01:01:30.809 A:middle L:90%
is available, but it's expensive, as with

1415
01:01:30.809 --> 01:01:34.030 A:middle L:90%
the example of geocoding. I could have paid

1416
01:01:34.030 --> 01:01:36.750 A:middle L:90%
$1200 to get latitude and longitude. Like, that wasn't

1417
01:01:36.750 --> 01:01:37.630 A:middle L:90%
actually going to happen. But theoretically, one could

1418
01:01:37.630 --> 01:01:40.380 A:middle L:90%
pay $1200 to get all of those addresses geocoded.

1419
01:01:40.380 --> 01:01:43.880 A:middle L:90%
In practice, there's a lot of innovation that

1420
01:01:43.880 --> 01:01:46.500 A:middle L:90%
simply doesn't happen as a result. So uh,

1421
01:01:46.510 --> 01:01:49.880 A:middle L:90%
yeah, so sometimes the data does exist for exactly

1422
01:01:49.880 --> 01:01:52.190 A:middle L:90%
for things like Zillow, but it's it's tough to

1423
01:01:52.190 --> 01:01:52.449 A:middle L:90%
know that it's accurate. It's tough to know

1424
01:01:52.449 --> 01:01:53.960 A:middle L:90%
that it's up to date. And then it's only available

1425
01:01:53.960 --> 01:01:59.630 A:middle L:90%
to Zillow. So, yeah, the cost of

1426
01:01:59.630 --> 01:02:01.199 A:middle L:90%
work the government has already done. The private sector

1427
01:02:01.199 --> 01:02:05.900 A:middle L:90%
has to bear repeatedly, and this is incredibly inefficient

1428
01:02:05.909 --> 01:02:07.030 A:middle L:90%
, and I think it's also in the abstract.

1429
01:02:07.039 --> 01:02:09.610 A:middle L:90%
But I think it's also it's unfair to taxpayers who

1430
01:02:09.710 --> 01:02:13.289 A:middle L:90%
, returning to my list of ineffective arguments, have

1431
01:02:13.289 --> 01:02:15.599 A:middle L:90%
already paid for this data. Why do we need

1432
01:02:15.599 --> 01:02:17.420 A:middle L:90%
to pay for it again when it's already available?

1433
01:02:19.400 --> 01:02:21.710 A:middle L:90%
Mhm. Well, they are related to the sort

1434
01:02:21.710 --> 01:02:23.769 A:middle L:90%
of quasi commercial activities of the state. The lottery

1435
01:02:23.769 --> 01:02:25.730 A:middle L:90%
comes to mind. Maybe there's some other ones, too.

1436
01:02:25.739 --> 01:02:28.579 A:middle L:90%
Lottery's actually on my list, like, why

1437
01:02:28.579 --> 01:02:30.000 A:middle L:90%
not put up the list of all the lottery numbers

1438
01:02:30.000 --> 01:02:30.059 A:middle L:90%
? I don't know what somebody's going to do with

1439
01:02:30.059 --> 01:02:32.570 A:middle L:90%
a list of every... Like, that data is not useful

1440
01:02:32.570 --> 01:02:35.380 A:middle L:90%
the next day, you know, like either you

1441
01:02:35.380 --> 01:02:37.219 A:middle L:90%
won or you didn't. But, but yeah, I

1442
01:02:37.219 --> 01:02:39.280 A:middle L:90%
want that available. Yeah, like, as an amateur

1443
01:02:39.280 --> 01:02:42.630 A:middle L:90%
statistician, which is the most dangerous kind of statistician

1444
01:02:42.630 --> 01:02:44.349 A:middle L:90%
, by the way, Uh, I would love

1445
01:02:44.349 --> 01:02:45.449 A:middle L:90%
to analyze winning lottery numbers in Virginia. That's,

1446
01:02:45.460 --> 01:02:49.880 A:middle L:90%
uh, if it's not totally random, that's alarming

1447
01:02:49.880 --> 01:02:52.000 A:middle L:90%
, right? Like, yeah, I think there's

1448
01:02:52.000 --> 01:02:52.139 A:middle L:90%
a lot of data like that. That would be

1449
01:02:52.139 --> 01:02:59.920 A:middle L:90%
really interesting. Yeah. Any other questions? Yes

1450
01:02:59.920 --> 01:03:02.929 A:middle L:90%
, sir. Okay. Yeah. It's crazy passion

1451
01:03:02.940 --> 01:03:07.619 A:middle L:90%
. I love it. Uh, specific area.

1452
01:03:09.179 --> 01:03:16.010 A:middle L:90%
You government board open data that you really right.

1453
01:03:17.199 --> 01:03:22.920 A:middle L:90%
Really? I'm really excited right now, and yet

1454
01:03:22.920 --> 01:03:24.340 A:middle L:90%
I've done nothing on it. So maybe there's some

1455
01:03:24.340 --> 01:03:29.769 A:middle L:90%
cognitive dissonance here. Transportation data. There's a gold

1456
01:03:29.769 --> 01:03:30.989 A:middle L:90%
mine in transportation data. So the Virginia Department of

1457
01:03:30.989 --> 01:03:34.949 A:middle L:90%
Transportation, on their, I think it's 511 Virginia dot

1458
01:03:34.949 --> 01:03:37.420 A:middle L:90%
org, Virginia 511. I don't know any other website where

1459
01:03:37.420 --> 01:03:38.639 A:middle L:90%
you can see at all times. Uh, what

1460
01:03:38.639 --> 01:03:42.710 A:middle L:90%
are the incidents occurring right now? What lanes are

1461
01:03:42.710 --> 01:03:44.809 A:middle L:90%
closed? What? Construction is planned? Uh,

1462
01:03:44.820 --> 01:03:45.510 A:middle L:90%
Where has there been an accident? Where are the traffic

1463
01:03:45.510 --> 01:03:46.960 A:middle L:90%
cameras so that I can look at the traffic in

1464
01:03:46.960 --> 01:03:50.610 A:middle L:90%
real time. They provide all of that, but

1465
01:03:50.610 --> 01:03:52.460 A:middle L:90%
only on their website. And so I wrote some

1466
01:03:52.460 --> 01:03:53.690 A:middle L:90%
code to automatically pull it off their website. Within a

1467
01:03:53.690 --> 01:03:58.900 A:middle L:90%
few days, they blocked me. Why? This

1468
01:03:58.900 --> 01:04:00.440 A:middle L:90%
is really useful. I would love. I don't

1469
01:04:00.440 --> 01:04:02.570 A:middle L:90%
want to have to visit Virginia 511 dot org,

1470
01:04:02.570 --> 01:04:05.360 A:middle L:90%
which looks terrible on the phone before every trip.

1471
01:04:05.369 --> 01:04:08.559 A:middle L:90%
I just want Google Maps to be able to grab

1472
01:04:08.559 --> 01:04:11.730 A:middle L:90%
their data and tell me, you're getting off on 11

1473
01:04:11.739 --> 01:04:13.860 A:middle L:90%
. Get off 81, get on 11. Just trust me

1474
01:04:13.869 --> 01:04:15.409 A:middle L:90%
. You're gonna be happier if you do that.

1475
01:04:15.690 --> 01:04:16.699 A:middle L:90%
I shouldn't have to review it and plan my own

1476
01:04:16.699 --> 01:04:18.619 A:middle L:90%
trip. It's 2015, like our software should be

1477
01:04:18.619 --> 01:04:20.860 A:middle L:90%
able to react to this. So there's a lot

1478
01:04:20.860 --> 01:04:24.210 A:middle L:90%
of ways in which transportation that's that's a really pedestrian

1479
01:04:24.210 --> 01:04:26.309 A:middle L:90%
. Well, I shouldn't... pedestrian actually has a literal meaning

1480
01:04:26.309 --> 01:04:29.030 A:middle L:90%
here, but that's a really boring, uh,

1481
01:04:29.059 --> 01:04:30.500 A:middle L:90%
mundane example of the value of it. But the

1482
01:04:30.510 --> 01:04:33.110 A:middle L:90%
lack of data sharing on a local, state and

1483
01:04:33.110 --> 01:04:36.280 A:middle L:90%
federal level between transportation agencies on both mass transit and

1484
01:04:36.280 --> 01:04:41.059 A:middle L:90%
transportation generally is appalling and wasteful, and it is

1485
01:04:41.059 --> 01:04:43.099 A:middle L:90%
really holding back a lot of innovation. So I'm

1486
01:04:43.099 --> 01:04:44.940 A:middle L:90%
excited about that. And a close runner up is

1487
01:04:44.940 --> 01:04:46.420 A:middle L:90%
healthcare data. The amount of money waiting to be

1488
01:04:46.420 --> 01:04:50.869 A:middle L:90%
saved in health data is vast. Doctors ought to

1489
01:04:50.869 --> 01:04:54.340 A:middle L:90%
be able to see that patient who has come in

1490
01:04:54.489 --> 01:04:56.920 A:middle L:90%
four times with asthma attacks in the past year.

1491
01:04:57.289 --> 01:05:00.719 A:middle L:90%
She lives down wind from a paper mill and her

1492
01:05:00.719 --> 01:05:04.539 A:middle L:90%
asthma attacks coincide with incidents we see from EPA data

1493
01:05:04.639 --> 01:05:08.400 A:middle L:90%
of excess pollution coming from there, which we can now

1494
01:05:08.400 --> 01:05:10.809 A:middle L:90%
see kills 10 people a year in this one town

1495
01:05:11.190 --> 01:05:14.000 A:middle L:90%
because we can correlate those data. So then doctors

1496
01:05:14.000 --> 01:05:15.690 A:middle L:90%
can say, oh, I see you're here for an asthma

1497
01:05:15.690 --> 01:05:17.340 A:middle L:90%
attack. I also see that where you live is downwind

1498
01:05:17.340 --> 01:05:20.139 A:middle L:90%
and your attack... Yes, you need to move,

1499
01:05:20.139 --> 01:05:21.519 A:middle L:90%
little girl, or you're going to die. Tough

1500
01:05:21.519 --> 01:05:24.619 A:middle L:90%
advice for a nine-year-old, possibly hard to act

1501
01:05:24.619 --> 01:05:27.829 A:middle L:90%
upon, but certainly very useful public policy data

1502
01:05:27.840 --> 01:05:29.860 A:middle L:90%
. So there's a lot of ways that health data

1503
01:05:29.860 --> 01:05:33.420 A:middle L:90%
is also waiting to be used for individual value for

1504
01:05:33.420 --> 01:05:38.489 A:middle L:90%
community value for hospital value. In particular, as

1505
01:05:38.489 --> 01:05:40.940 A:middle L:90%
we see the Affordable Care Act do more to bend

1506
01:05:40.940 --> 01:05:45.090 A:middle L:90%
the cost curve, healthcare doctors and hospitals are gonna

1507
01:05:45.090 --> 01:05:46.239 A:middle L:90%
have a strong incentive, a financial incentive to keep

1508
01:05:46.239 --> 01:05:49.139 A:middle L:90%
patients well. And so they're about to be very

1509
01:05:49.139 --> 01:05:51.820 A:middle L:90%
excited about data like this that will allow them to

1510
01:05:51.820 --> 01:05:55.750 A:middle L:90%
drive down their costs by keeping people healthier. Because

1511
01:05:55.750 --> 01:05:58.920 A:middle L:90%
at the moment, and we're not blaming individual doctors here

1512
01:05:58.960 --> 01:06:00.219 A:middle L:90%
, but the industry on the whole, it's

1513
01:06:00.219 --> 01:06:02.590 A:middle L:90%
like the power industry. The power industry is supposed

1514
01:06:02.590 --> 01:06:05.219 A:middle L:90%
to tell people to use less power. That's all

1515
01:06:05.219 --> 01:06:08.079 A:middle L:90%
they sell. Why do they want people to consume less? It's

1516
01:06:08.079 --> 01:06:10.590 A:middle L:90%
crazy. There's no other industry in which we'd make

1517
01:06:10.590 --> 01:06:13.380 A:middle L:90%
them tell people and, like put systems in place

1518
01:06:13.380 --> 01:06:15.929 A:middle L:90%
to use less of their product. Well, tobacco

1519
01:06:15.929 --> 01:06:16.800 A:middle L:90%
, I suppose, is another one. Kind of, alcohol.

1520
01:06:17.179 --> 01:06:19.460 A:middle L:90%
Um, but as with health care, like for

1521
01:06:19.460 --> 01:06:23.329 A:middle L:90%
a doctor, I mean, unless you're a monster

1522
01:06:23.329 --> 01:06:25.659 A:middle L:90%
, this isn't individually true. But there's a perverse

1523
01:06:25.659 --> 01:06:28.590 A:middle L:90%
incentive you want your patients to be sick because that's

1524
01:06:28.590 --> 01:06:30.150 A:middle L:90%
your business model. So we need to find ways

1525
01:06:30.150 --> 01:06:34.369 A:middle L:90%
to help doctors see ways that they can make individual

1526
01:06:34.369 --> 01:06:36.460 A:middle L:90%
and all of their patients better, and the Affordable Care Act

1527
01:06:36.460 --> 01:06:39.070 A:middle L:90%
is starting to force that. So they're about to

1528
01:06:39.079 --> 01:06:40.599 A:middle L:90%
be pretty excited about that with health care data.

1529
01:06:41.280 --> 01:06:44.460 A:middle L:90%
Sorry, I saw another hand somewhere. So you talked

1530
01:06:44.460 --> 01:06:47.440 A:middle L:90%
about how different levels of government and the same levels

1531
01:06:47.440 --> 01:06:50.880 A:middle L:90%
of government don't communicate with each other about with with

1532
01:06:50.889 --> 01:06:55.539 A:middle L:90%
regard to sharing their data. What is their response

1533
01:06:55.539 --> 01:07:00.019 A:middle L:90%
when you speak to only love? Yeah, So

1534
01:07:00.019 --> 01:07:02.880 A:middle L:90%
their response to... because you're basically saying government needs to

1535
01:07:02.880 --> 01:07:04.599 A:middle L:90%
change. We need to change how we do.

1536
01:07:04.980 --> 01:07:09.750 A:middle L:90%
You know, when I talked to audiences that are

1537
01:07:09.750 --> 01:07:12.130 A:middle L:90%
mostly government, a reliable laugh, you're gonna have

1538
01:07:12.130 --> 01:07:13.340 A:middle L:90%
to trust me because this will not be funny to

1539
01:07:13.340 --> 01:07:14.769 A:middle L:90%
any of you. But a reliable laugh line is when I

1540
01:07:14.769 --> 01:07:17.039 A:middle L:90%
say Here's what data looks like in government or open

1541
01:07:17.039 --> 01:07:19.039 A:middle L:90%
data looks like in government. You walking down and

1542
01:07:19.039 --> 01:07:20.840 A:middle L:90%
knocking on Norma Jean's door and saying, Hey,

1543
01:07:20.840 --> 01:07:24.409 A:middle L:90%
Norma Jean, listen, I know you're the database

1544
01:07:24.409 --> 01:07:26.489 A:middle L:90%
administrator. Could you export some data for me?

1545
01:07:26.489 --> 01:07:29.289 A:middle L:90%
Here's a thumb drive that would be great. Or

1546
01:07:29.300 --> 01:07:31.269 A:middle L:90%
it's going to like the L Drive on the network

1547
01:07:31.280 --> 01:07:34.969 A:middle L:90%
and going through the crazy, calcified network structure that

1548
01:07:34.969 --> 01:07:38.500 A:middle L:90%
somebody actually put together in 1999 and didn't think

1549
01:07:38.500 --> 01:07:40.380 A:middle L:90%
it would still be in use to find the data

1550
01:07:40.380 --> 01:07:43.340 A:middle L:90%
that you need. There's a belief among open data

1551
01:07:43.340 --> 01:07:45.860 A:middle L:90%
types in the private sector that government is just

1552
01:07:45.860 --> 01:07:47.210 A:middle L:90%
sitting on, like these amazing databases of all of

1553
01:07:47.210 --> 01:07:49.070 A:middle L:90%
their data that they can access and they just won't

1554
01:07:49.070 --> 01:07:51.750 A:middle L:90%
share it. And it's a it's a mess in

1555
01:07:51.750 --> 01:07:56.769 A:middle L:90%
there. And so when I talk about improving the

1556
01:07:56.780 --> 01:07:59.849 A:middle L:90%
provision of data within government for data people in government

1557
01:07:59.849 --> 01:08:00.349 A:middle L:90%
, that's often when their heads start nodding, like, oh

1558
01:08:00.349 --> 01:08:02.880 A:middle L:90%
, thank God, somebody's thinking of us because from

1559
01:08:02.880 --> 01:08:04.940 A:middle L:90%
their perspective, it can be really frustrating to see

1560
01:08:04.949 --> 01:08:08.090 A:middle L:90%
all this talk about improving data for the public.

1561
01:08:08.090 --> 01:08:10.469 A:middle L:90%
And they're thinking, Yeah, but what about me

1562
01:08:10.480 --> 01:08:12.389 A:middle L:90%
? Our system is terrible. How do we make

1563
01:08:12.389 --> 01:08:16.609 A:middle L:90%
my job easier? And so it's usually received enthusiastically

1564
01:08:16.609 --> 01:08:18.800 A:middle L:90%
, and IT folks might object, but their

1565
01:08:18.800 --> 01:08:23.970 A:middle L:90%
objection is usually, we don't have the resources. Yeah

1566
01:08:23.979 --> 01:08:26.109 A:middle L:90%
, because we regard IT as, like, these

1567
01:08:26.109 --> 01:08:28.369 A:middle L:90%
Scooby people who are supposed to live in the basement

1568
01:08:28.380 --> 01:08:30.199 A:middle L:90%
, who we pay not wildly well, but just

1569
01:08:30.199 --> 01:08:32.210 A:middle L:90%
enough to appease the technology gods because it's some add

1570
01:08:32.210 --> 01:08:35.359 A:middle L:90%
on that we need instead of recognizing that it's 2015

1571
01:08:35.359 --> 01:08:38.920 A:middle L:90%
. It's the underpinning of a lot of governments and

1572
01:08:38.920 --> 01:08:43.670 A:middle L:90%
technology, and society is mediated through digital technology,

1573
01:08:43.989 --> 01:08:45.899 A:middle L:90%
and so it is no longer a bolt on.

1574
01:08:45.270 --> 01:08:48.970 A:middle L:90%
It's It's how government needs to work. And so

1575
01:08:48.979 --> 01:08:51.319 A:middle L:90%
that's That's when I start to win over the people

1576
01:08:51.319 --> 01:08:53.819 A:middle L:90%
in government. Yes, thank you, Thank you

1577
01:08:53.819 --> 01:08:56.199 A:middle L:90%
. We are important. But I think that's usually

1578
01:08:56.199 --> 01:08:58.119 A:middle L:90%
why it's received pretty well by folks, folks within

1579
01:08:58.119 --> 01:09:00.220 A:middle L:90%
government. And there's a big move in the federal

1580
01:09:00.220 --> 01:09:02.840 A:middle L:90%
government to change this. After the disaster of healthcare

1581
01:09:02.840 --> 01:09:05.840 A:middle L:90%
dot gov, which was outsourced for a fortune famously

1582
01:09:05.840 --> 01:09:10.039 A:middle L:90%
to several companies who did an awful job, there

1583
01:09:10.039 --> 01:09:12.079 A:middle L:90%
was a realization within the federal government. You know

1584
01:09:12.079 --> 01:09:14.300 A:middle L:90%
, that that site was bailed out by a dozen

1585
01:09:14.300 --> 01:09:15.340 A:middle L:90%
people, a few of whom I was pleased to

1586
01:09:15.340 --> 01:09:16.670 A:middle L:90%
work with when I was in the White House a

1587
01:09:16.670 --> 01:09:20.619 A:middle L:90%
few years ago who were called up, many of

1588
01:09:20.619 --> 01:09:23.600 A:middle L:90%
whom were no longer in federal service, were called

1589
01:09:23.600 --> 01:09:25.369 A:middle L:90%
up, saying, We need you, your country

1590
01:09:25.369 --> 01:09:27.050 A:middle L:90%
needs you. Please report to a hotel room in

1591
01:09:27.050 --> 01:09:28.680 A:middle L:90%
Columbia, Maryland, and you're going to live here

1592
01:09:28.680 --> 01:09:30.810 A:middle L:90%
indefinitely and fix this website. The website was fixed

1593
01:09:30.819 --> 01:09:33.949 A:middle L:90%
by a dozen people in a hotel suite on sleepless

1594
01:09:33.949 --> 01:09:36.949 A:middle L:90%
nights, when they said, the whole system we used to

1595
01:09:36.949 --> 01:09:41.029 A:middle L:90%
build this, this whole junky website, is out. We're

1596
01:09:41.029 --> 01:09:43.729 A:middle L:90%
doing it our way, and they updated the website

1597
01:09:43.739 --> 01:09:45.079 A:middle L:90%
daily. They constantly fixed it, and it got so

1598
01:09:45.079 --> 01:09:48.090 A:middle L:90%
much better in a month. It took years to

1599
01:09:48.090 --> 01:09:49.930 A:middle L:90%
screw up. They fixed it in no time. So

1600
01:09:49.930 --> 01:09:55.369 A:middle L:90%
this attitude has been, um, made the way

1601
01:09:55.369 --> 01:09:57.329 A:middle L:90%
the government is trying to work now with this new

1602
01:09:57.340 --> 01:09:59.619 A:middle L:90%
organization called 18F, because it's at the corner of 18th

1603
01:09:59.619 --> 01:10:01.060 A:middle L:90%
and F Street. It's sort of an internal IT

1604
01:10:01.060 --> 01:10:04.159 A:middle L:90%
procurement agency for the federal government so agencies can work

1605
01:10:04.159 --> 01:10:08.149 A:middle L:90%
with brilliant IT people who work directly for

1606
01:10:08.149 --> 01:10:10.170 A:middle L:90%
the federal government, who are regarding working for

1607
01:10:10.170 --> 01:10:12.630 A:middle L:90%
18F as a form of federal service. Like, they're

1608
01:10:12.630 --> 01:10:14.579 A:middle L:90%
not going into the military. They're working for 18

1609
01:10:14.579 --> 01:10:16.460 A:middle L:90%
F. They're going to make half, maybe 10%

1610
01:10:16.460 --> 01:10:18.779 A:middle L:90%
of what they could make in San Francisco. But

1611
01:10:18.789 --> 01:10:20.220 A:middle L:90%
by God, they're going to serve their country and

1612
01:10:20.220 --> 01:10:21.890 A:middle L:90%
they're working at 18F for a couple years and show

1613
01:10:21.890 --> 01:10:24.510 A:middle L:90%
that there are better ways to do it in this

1614
01:10:24.510 --> 01:10:26.819 A:middle L:90%
country. And I hope that starts trickling down to

1615
01:10:26.819 --> 01:10:29.319 A:middle L:90%
a state level at a local level. And I

1616
01:10:29.319 --> 01:10:31.199 A:middle L:90%
sure hope that Virginia emulates this and creates an internal

1617
01:10:31.199 --> 01:10:34.020 A:middle L:90%
I T agency because our current approach of outsourcing everything

1618
01:10:34.020 --> 01:10:35.340 A:middle L:90%
to... is it Northrop Grumman? Is that who has

1619
01:10:35.340 --> 01:10:38.920 A:middle L:90%
the contract? It's a disaster. It's gone terribly

1620
01:10:38.920 --> 01:10:41.189 A:middle L:90%
for everybody. And I think we need to yank

1621
01:10:41.189 --> 01:10:44.180 A:middle L:90%
that contract and hire competent people. It's a dream

1622
01:10:44.760 --> 01:10:45.369 A:middle L:90%
. I'll take one more question before I let you

1623
01:10:45.369 --> 01:10:46.670 A:middle L:90%
poor people go, but I will hang out

1624
01:10:46.670 --> 01:10:48.609 A:middle L:90%
and talk to anybody for as long as people want

1625
01:10:48.609 --> 01:10:53.069 A:middle L:90%
to talk to me. Yes, scary. But

1626
01:10:53.079 --> 01:10:55.390 A:middle L:90%
the question that I have is how do we continue

1627
01:10:55.390 --> 01:10:58.770 A:middle L:90%
this conversation? I'm relatively new, higher things,

1628
01:10:58.770 --> 01:11:02.329 A:middle L:90%
but I am alright. Salad. We need to

1629
01:11:02.329 --> 01:11:04.520 A:middle L:90%
have it happen. My background is agriculture. And

1630
01:11:04.520 --> 01:11:08.180 A:middle L:90%
so I want inter agency operations to the ailment back

1631
01:11:08.180 --> 01:11:11.420 A:middle L:90%
two years or my farmer's got home. And so

1632
01:11:11.430 --> 01:11:14.779 A:middle L:90%
I need to figure out a way to be able

1633
01:11:14.779 --> 01:11:18.149 A:middle L:90%
to continue this conversation and to be take one take

1634
01:11:18.149 --> 01:11:20.069 A:middle L:90%
home message, help me try to craft that.

1635
01:11:20.460 --> 01:11:24.739 A:middle L:90%
Sure. So ag data is huge, and it is

1636
01:11:24.739 --> 01:11:27.800 A:middle L:90%
really huge. There's a I think it's only a

1637
01:11:27.800 --> 01:11:30.279 A:middle L:90%
four- or five-year-old company now, that Monsanto

1638
01:11:30.279 --> 01:11:32.289 A:middle L:90%
bought for a billion dollars last year. This company

1639
01:11:32.289 --> 01:11:36.609 A:middle L:90%
combined open agriculture data: soil, the aforementioned soil data

1640
01:11:36.649 --> 01:11:41.609 A:middle L:90%
, soil data, uh, weather data and crop

1641
01:11:41.609 --> 01:11:44.159 A:middle L:90%
data, all public data. They combined this

1642
01:11:44.159 --> 01:11:46.539 A:middle L:90%
in some very clever fashion, fashions, rather, and

1643
01:11:46.550 --> 01:11:49.729 A:middle L:90%
they now have an insurance, it's crop insurance.

1644
01:11:49.729 --> 01:11:53.899 A:middle L:90%
Basically, they'll insure your crops if you grow what

1645
01:11:53.899 --> 01:11:57.130 A:middle L:90%
they say will grow well, where they say to grow

1646
01:11:57.130 --> 01:11:58.949 A:middle L:90%
them, and you plant when they say to plant

1647
01:11:58.949 --> 01:12:00.510 A:middle L:90%
and harvest when they say to harvest, because their data

1648
01:12:00.510 --> 01:12:02.939 A:middle L:90%
is so precise on a field-by-field basis,

1649
01:12:02.939 --> 01:12:05.460 A:middle L:90%
they'll say, in your north field, you grow alfalfa and you

1650
01:12:05.460 --> 01:12:08.520 A:middle L:90%
plant it on the 18th. We're not gonna give

1651
01:12:08.520 --> 01:12:10.210 A:middle L:90%
you crop insurance if you grow anything other than alfalfa

1652
01:12:10.220 --> 01:12:13.630 A:middle L:90%
, go plant it on the 18th. And the results are excellent crops

1653
01:12:13.630 --> 01:12:15.670 A:middle L:90%
. The price of the crop insurance goes way down

1654
01:12:15.050 --> 01:12:17.659 A:middle L:90%
because the results are much better. So Monsanto bought

1655
01:12:17.659 --> 01:12:20.109 A:middle L:90%
them for a billion bucks. It's an entirely open data

1656
01:12:20.119 --> 01:12:23.659 A:middle L:90%
powered company. It's been a huge success story,

1657
01:12:23.670 --> 01:12:27.319 A:middle L:90%
so I am psyched about ag data. And farmers,

1658
01:12:27.329 --> 01:12:28.729 A:middle L:90%
for any of y'all without a background in

1659
01:12:28.729 --> 01:12:30.989 A:middle L:90%
that, like, they got GPS in their tractors

1660
01:12:30.989 --> 01:12:34.520 A:middle L:90%
now. Like, their planting is incredibly precise. It

1661
01:12:34.520 --> 01:12:38.430 A:middle L:90%
is. It is absolutely a science, uh,

1662
01:12:38.430 --> 01:12:41.289 A:middle L:90%
and quite modern. They'll they'll even instant message in

1663
01:12:41.289 --> 01:12:43.390 A:middle L:90%
the field to each other between the tractors and and

1664
01:12:43.390 --> 01:12:45.060 A:middle L:90%
coordinate with the iPads that they might have mounted up in their

1665
01:12:45.060 --> 01:12:47.670 A:middle L:90%
tractor. And it's a pretty cool system. So

1666
01:12:48.050 --> 01:12:50.760 A:middle L:90%
my take-home for you is this, he said

1667
01:12:50.760 --> 01:12:53.479 A:middle L:90%
, trying to think of the take-home. Uh

1668
01:12:53.850 --> 01:12:57.520 A:middle L:90%
, I encourage you to think of one way in

1669
01:12:57.520 --> 01:13:01.930 A:middle L:90%
which there is data that either doesn't exist as open

1670
01:13:01.930 --> 01:13:05.619 A:middle L:90%
data, but probably exists somewhere that should exist,

1671
01:13:05.819 --> 01:13:09.109 A:middle L:90%
whether it be on a federal or on a state

1672
01:13:09.109 --> 01:13:10.920 A:middle L:90%
level, and I encourage you to think on those levels in

1673
01:13:10.920 --> 01:13:14.699 A:middle L:90%
particular. If you can identify what that is and

1674
01:13:14.699 --> 01:13:16.149 A:middle L:90%
come up with a use case: here are farmers

1675
01:13:16.149 --> 01:13:18.590 A:middle L:90%
I've talked to. You've got a one-pager: I

1676
01:13:18.590 --> 01:13:20.720 A:middle L:90%
talked to this farmer who said this, this farmer who

1677
01:13:20.720 --> 01:13:23.880 A:middle L:90%
said this. At Virginia Tech, we're prepared to do

1678
01:13:23.880 --> 01:13:27.560 A:middle L:90%
this thing. And then you approach both state and federal

1679
01:13:27.560 --> 01:13:29.770 A:middle L:90%
ag departments and say, listen, here's the thing that

1680
01:13:29.770 --> 01:13:31.770 A:middle L:90%
would be possible if you would just publish this data

1681
01:13:32.550 --> 01:13:34.319 A:middle L:90%
, and often it's available, but it's available as

1682
01:13:34.319 --> 01:13:36.500 A:middle L:90%
bad information or it's PDFs. You can't do anything

1683
01:13:36.500 --> 01:13:39.710 A:middle L:90%
with that. Um, I think you'd probably get

1684
01:13:39.710 --> 01:13:41.979 A:middle L:90%
a better reception on a state level. It's a

1685
01:13:41.979 --> 01:13:43.390 A:middle L:90%
little hard to be heard on a federal level.

1686
01:13:43.390 --> 01:13:45.020 A:middle L:90%
I don't know who, I don't think there's a

1687
01:13:45.020 --> 01:13:48.789 A:middle L:90%
chief data officer within the Department of Agriculture federally.

1688
01:13:48.800 --> 01:13:50.689 A:middle L:90%
Uh, I hope there's a CTO.

1689
01:13:50.689 --> 01:13:53.390 A:middle L:90%
But often CTOs are too bogged down

1690
01:13:53.390 --> 01:13:56.140 A:middle L:90%
with system administration to deal with this. Um,

1691
01:13:56.149 --> 01:13:57.510 A:middle L:90%
I, I don't have, no, I have no useful

1692
01:13:57.510 --> 01:14:00.029 A:middle L:90%
connection there. But if you can gin up something

1693
01:14:00.029 --> 01:14:01.800 A:middle L:90%
like that, that one case that can start to

1694
01:14:01.800 --> 01:14:04.399 A:middle L:90%
crack that nut, uh, of getting the data

1695
01:14:04.399 --> 01:14:06.020 A:middle L:90%
exposed. And from there, it's a little easier

1696
01:14:06.020 --> 01:14:08.529 A:middle L:90%
to build out. You know, once I get

1697
01:14:08.529 --> 01:14:10.489 A:middle L:90%
that business data, that's all right. You know

1698
01:14:10.489 --> 01:14:12.550 A:middle L:90%
, for Virginia, we've got the business data released

1699
01:14:12.560 --> 01:14:15.050 A:middle L:90%
. Let's get licensing data. Oh, got license

1700
01:14:15.050 --> 01:14:15.930 A:middle L:90%
data. Great. Well, let's get records of

1701
01:14:15.939 --> 01:14:18.439 A:middle L:90%
professional admonishments, and you can build out from there

1702
01:14:18.439 --> 01:14:20.430 A:middle L:90%
. I think the same thing works just as well

1703
01:14:20.430 --> 01:14:24.479 A:middle L:90%
in ag or anywhere else. Well, I'm

1704
01:14:24.479 --> 01:14:26.090 A:middle L:90%
going to hang out here as long as anybody wants

1705
01:14:26.090 --> 01:14:28.800 A:middle L:90%
to chat. I'm really grateful to you all for coming

1706
01:14:28.800 --> 01:14:30.739 A:middle L:90%
out. I understand. Uh, it's not the

1707
01:14:30.739 --> 01:14:31.239 A:middle L:90%
most exciting thing to do on a Friday night to

1708
01:14:31.239 --> 01:14:33.279 A:middle L:90%
come out here and talk about open data, but I

1709
01:14:33.279 --> 01:14:36.010 A:middle L:90%
am very excited to spend Friday night, uh,

1710
01:14:36.010 --> 01:14:39.060 A:middle L:90%
talking about open data, and I appreciate having found

1711
01:14:39.060 --> 01:14:41.550 A:middle L:90%
some people who are likewise engaged at the end.

