Big Data and the Sciences

2021-12-07

Admin

Grades

I will send everyone an estimate of their grade, with error bars, before class on Tuesday.

Makeup responses:

  1. Edward Snowden: American Hero
  2. Classify more ads; describe one additional one.

Makeup project/paper components:

  1. Implement taxonomies on the ads.

In the Cloud

Big Data is the precondition for modern “Machine Learning.”

  • 1990-2010: Large datasets built, various algorithms for prediction.
  • 2013-present: Widespread use of neural networks with backpropagation.

How is Big Data different from data?

  • There’s more of it.
  • It lacks structure.
  • It lacks statistical sampling.

???

From http://ai.stanford.edu/~ang/papers/icml12-HighLevelFeaturesUsingUnsupervisedLearning.pdf

???

Try 2

  • Supervised learning: Learning from training data that comes with labels.
  • Unsupervised learning: Attempting to learn intrinsic patterns from unlabeled data.
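The distinction can be made concrete with a toy example. The sketch below is only illustrative: the data points, labels, and function names are invented for this purpose, and nearest-centroid classification and two-cluster k-means are simple stand-ins, not the methods used in the paper cited above. The supervised function is given the labels; the unsupervised one sees only the numbers and has to find the two groups itself.

```python
# Toy 1-D data: hypothetical measurements, invented for illustration.
points = [1.0, 1.2, 0.8, 5.0, 5.3, 4.9]
labels = ["a", "a", "a", "b", "b", "b"]  # only the supervised path sees these


def mean(values):
    return sum(values) / len(values)


def fit_supervised(xs, ys):
    """Nearest-centroid classifier: compute one mean per known label."""
    groups = {}
    for x, y in zip(xs, ys):
        groups.setdefault(y, []).append(x)
    return {y: mean(vals) for y, vals in groups.items()}


def predict(centroids, x):
    """Return the label whose centroid is closest to x."""
    return min(centroids, key=lambda y: abs(centroids[y] - x))


def fit_unsupervised(xs, rounds=10):
    """Two-cluster k-means in 1-D, seeded at the min and max of the data."""
    centers = [min(xs), max(xs)]
    for _ in range(rounds):
        clusters = [[], []]
        for x in xs:
            nearest = 0 if abs(x - centers[0]) <= abs(x - centers[1]) else 1
            clusters[nearest].append(x)
        # Keep the old center if a cluster ends up empty.
        centers = [mean(c) if c else centers[i] for i, c in enumerate(clusters)]
    return centers


model = fit_supervised(points, labels)
print(predict(model, 4.0))       # label of the nearest labeled group
print(fit_unsupervised(points))  # two cluster centers found without labels
```

On this toy data both approaches recover the same two groups, but only the supervised model can attach the names "a" and "b" to them; the unsupervised result is just two unnamed cluster centers.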

With Supervision

Big Data before AI

Large Hadron Collider

Private-public rivals: Craig Venter, Celera Genomics.

Venter’s method

Source: Adam Kucharski, www.sbs.com.au

Netflix Prize

Netflix’s goal, c. 2014

For DVDs our goal is to help people fill their queue with titles to receive in the mail over the coming days and weeks; selection is distant in time from viewing, people select carefully because exchanging a DVD for another takes more than a day, and we get no feedback during viewing. For streaming members are looking for something great to watch right now; they can sample a few videos before settling on one, they can consume several in one session, and we can observe viewing statistics such as whether a video was watched fully or only partially.

Server Farms

Street view of data center

YouTube…

Surveillance and Privacy

Surveillance states

Questions:

  1. What is different about state data collection as opposed to all other forms? (Chinese silk)
  2. What counts as the state, anyway?
  3. What restraints should we place on state surveillance?

Privacy as a human right

U.S. Constitution, Fourth Amendment.

The right of the people to be secure in their persons, houses, papers, and effects, against unreasonable searches and seizures, shall not be violated, and no Warrants shall issue, but upon probable cause, supported by Oath or affirmation, and particularly describing the place to be searched, and the persons or things to be seized.

Privacy as a human right

Griswold v. Connecticut, 1965.

Various guarantees create zones of privacy. The right of association contained in the penumbra of the First Amendment is one, as we have seen. The Third Amendment, in its prohibition against the quartering of soldiers “in any house” in time of peace without the consent of the owner, is another facet of that privacy. The Fourth Amendment explicitly affirms the “right of the people to be secure in their persons, houses, papers, and effects, against unreasonable searches and seizures.” The Fifth Amendment, in its Self-Incrimination Clause, enables the citizen to create a zone of privacy which government may not force him to surrender to his detriment.

Snowden, Guardian interview

Google Encryption

Is this shocking?

Snowden, 2015, Fort Greene Park


Data Science