# Nativism vs. Empiricism: Ramifications for Artificial Natural Language Processing

Note: This is an essay I wrote for the subject Philosophy of Cognitive Science that was part of my bachelor’s course. I think it might be interesting to others, so I’ve decided to publish it here. The format is adapted slightly to be more suitable for this blog; the content is unchanged.

In the field of artificial intelligence, humans are often used as prime examples of adaptable agents with general intelligence. The goal of some artificial intelligence researchers is to arrive at an artificial general, or human-level, intelligence. These agents should be able to perform many of the same tasks with the same adaptability as humans are able to. One of the few empirical certainties in the endeavour of creating such intelligent agents is that the natural, human intelligence works. Thus, there is merit to artificial intelligence research that strives to mimic human intelligence by modelling human mechanisms.

An intriguing and far-from-settled debate concerns the origin of human knowledge, skills, abilities and thought in general. The major theories can be identified as lying somewhere between the two extremes of full-blown nativism and full-blown empiricism [1]. Nativistic theorists would argue for innate knowledge; at least some of our capabilities arise from hard-wired pathways in our nervous system that are available at birth. In contrast, empiricists would argue that these capabilities are learned from experience utilizing the brain’s various capacities for learning. For example, a baby’s suckling after birth is likely innate, whereas the behavioural pattern of brushing your teeth is likely learned. It is still unknown which combination of these extremes in this seemingly easy distinction is correct.

When striving to model human capacities in an artificial intelligence, knowing which parts of human intelligence and other capabilities are hard-wired and which parts arise from experiences should be of particular interest to artificial intelligence researchers. In the following, we will look at the innateness (or lack thereof) of language parsing and acquisition. From this, recommendations will be made regarding the high-level design of an artificial natural language processor.

[1] S. Gross and G. Rey, “Innateness,” 2012.

# Introduction to Artificial Neural Networks

An artificial neural network (ANN) is a type of machine learning model. It is made up of a number of simple parts called units, or neurons. By combining a large number of these simple units, ANNs can solve real-world problems. For example, the main network that was used in my bachelor thesis research consisted of over 12,000 units. The name artificial neural network is slightly misleading: artificial and biological neural networks are related mostly in that both are made up of many simple parts. Other than that, they’re quite unrelated.
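To make “simple unit” a bit more concrete, here is a minimal sketch of what one unit might compute. The post doesn’t say which kind of unit the thesis network used, so the weighted sum followed by a sigmoid is an assumption, and the names `unit_output` and `layer_output` are made up for this illustration.

```python
import math


def unit_output(inputs, weights, bias):
    """One unit: a weighted sum of the inputs, squashed by a sigmoid (assumed)."""
    weighted_sum = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1.0 / (1.0 + math.exp(-weighted_sum))


def layer_output(inputs, weight_rows, biases):
    """A layer is many units that all look at the same inputs."""
    return [unit_output(inputs, w, b) for w, b in zip(weight_rows, biases)]
```

A network is then little more than several such layers, with the outputs of one layer fed in as the inputs of the next.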

# Decreasing Information

During the training of several neural networks for my bachelor’s thesis (more on that later, maybe!) I noticed something fun. The weights of these networks (in this case the parameters of the classification function) are initialized with numbers drawn from the standard normal distribution, meaning the initial network state is random. Such randomness by its very nature has no actual structure, and thus has high entropy. This means that compressing the weights to save them to disk is less effective than it would be for other, more structured, information. Initially, saving one such network’s weights required approximately 19.5MB of disk space.

As the networks’ training progressed, the file sizes shrank! After a day of training, the space required for saving this network’s weights had decreased to 18.0MB; a 1.5MB decrease from the original value. I hadn’t thought about it before, but once I noticed it I soon realized what was happening. The whole point of training networks is to find structure in data. A neural network does this by learning some sort of representation of the data through continuously updating its weights while training. In other words, the weights are getting more structured as the network is getting smarter! When the weights’ structure increases, the entropy decreases, making compression more effective and our disks happier. Or unhappier, perhaps.
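The effect is easy to reproduce on a small scale. The sketch below is not the thesis setup: it uses `zlib` as a stand-in for whatever compression was used when saving, and it fakes “structured” weights by restricting them to 16 distinct values, which is only a crude assumption about what trained weights look like.

```python
import random
import struct
import zlib


def compressed_size(values):
    """Pack the values as 32-bit floats and return the zlib-compressed size in bytes."""
    raw = struct.pack(f"{len(values)}f", *values)
    return len(zlib.compress(raw, 9))


rng = random.Random(0)  # fixed seed so the run is repeatable
n = 50_000

# Freshly initialized weights: draws from the standard normal distribution.
random_weights = [rng.gauss(0.0, 1.0) for _ in range(n)]

# Hypothetical "structured" weights: only 16 distinct values occur.
levels = [i / 8.0 - 1.0 for i in range(16)]
structured_weights = [rng.choice(levels) for _ in range(n)]

random_size = compressed_size(random_weights)
structured_size = compressed_size(structured_weights)
print(f"random: {random_size} bytes, structured: {structured_size} bytes")
```

Both lists take up the same space uncompressed; only the compressed sizes differ, because the compressor can exploit the repetition in the second list.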

# Rocket Fuel Requirements Revisited

In a previous post we looked at the fuel requirements for rockets to reach escape velocity. We calculated the fuel requirements using the rocket equation. This equation takes into account the conservation of momentum. However, momentum is not the only property influencing the velocity of the rocket during a launch.

Rockets expel their fuel over time. During this time, the rocket is pulled back by gravity. Only if a rocket could expel all of its fuel instantaneously, and if we ignore atmospheric drag, would the escape velocity be reached instantaneously and the equation hold.

Taking the burn-time and gravity into account yields a difficult differential equation. We can implement that equation in a computer program to simulate the launch.
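As a rough idea of what such a simulation could look like, here is a minimal sketch. It is not the program from the full post. It assumes a purely vertical launch, a constant rate of fuel consumption, no atmospheric drag, and gravity that weakens with altitude, so that the velocity obeys dv/dt = v_e · (fuel rate) / m − GM / (R + h)². The function name, the parameters, and the simple Euler time-stepping are all choices made for this illustration.

```python
import math

G = 6.674e-11        # gravitational constant, m^3 kg^-1 s^-2
M_EARTH = 5.972e24   # mass of Earth, kg
R_EARTH = 6.371e6    # mean radius of Earth, m


def simulate_launch(dry_mass, fuel_mass, exhaust_velocity, burn_time, dt=0.01):
    """Euler-integrate a vertical launch; return (velocity, altitude) at burnout."""
    fuel_rate = fuel_mass / burn_time          # kg of fuel expelled per second
    steps = round(burn_time / dt)
    mass, velocity, altitude = dry_mass + fuel_mass, 0.0, 0.0
    for _ in range(steps):
        gravity = G * M_EARTH / (R_EARTH + altitude) ** 2
        thrust_acceleration = exhaust_velocity * fuel_rate / mass
        velocity += (thrust_acceleration - gravity) * dt
        altitude += velocity * dt
        mass -= fuel_rate * dt
    return velocity, altitude


def ideal_delta_v(dry_mass, fuel_mass, exhaust_velocity):
    """Velocity gain from the rocket equation alone (no gravity, no burn time)."""
    return exhaust_velocity * math.log((dry_mass + fuel_mass) / dry_mass)
```

Comparing `simulate_launch(1000.0, 9000.0, 3000.0, 100.0)` with `ideal_delta_v(1000.0, 9000.0, 3000.0)` shows the effect described above: the simulated burnout velocity is lower than the rocket equation predicts, because gravity has been pulling on the rocket for the whole burn.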

# Black Holes

A black hole is an object from which nothing, including light, is able to escape. As nothing can go faster than light, this can be more formally defined as an object for which the escape velocity is greater than the speed of light. In my post about escape velocity we found an equation relating the velocity required to escape from the gravitational pull of an object and that object’s mass.

v = \sqrt{2G \frac{M}{r}}

In this equation, v is the escape velocity, M is the mass of the object we want to escape from and r is the distance from the center of mass of the object we’re escaping from. G is the gravitational constant.

If the required velocity v is greater than the speed of light, c, even light will not be able to escape the object, making it a black hole. We have two variables, M and r, so we can derive two equations from this. One equation gives the distance from the center of mass required to make escaping from the object impossible given the mass, and the other gives the mass required given the distance from the center of mass.

\begin{aligned}
\sqrt{2G \frac{M}{r}} &= c \\
2G \frac{M}{r} &= c^2 \\
\frac{M}{r} &= \frac{c^2}{2G} \\
M &= \frac{r c^2}{2G} \\
r &= \frac{2GM}{c^2}
\end{aligned}

The latter equation, r = \frac{2GM}{c^2}, is known as the Schwarzschild radius. It is the radius of the perfect sphere around the center of mass of the object, such that if all the mass is within that sphere the resulting escape velocity is equal to the speed of light. In other words, if the object were smaller than this, it would become a black hole. For Earth, the radius is slightly surprising:

\begin{aligned}
r &= \frac{2G M_{\oplus}}{c^2} \\
&= \frac{2 \cdot 6.674 \cdot 10^{-11} \cdot 5.972 \cdot 10^{24}}{\left(2.998 \cdot 10^8\right)^2} \text{ m} \approx 8.87 \text{ mm}
\end{aligned}

So, Earth would only become a black hole if it were compressed to the size of a marble. A black hole can be smaller than its Schwarzschild radius, however. In this case, the radius acts as the event horizon of the black hole: matter, or information, inside the radius would not necessarily be inside the black hole itself, but it would no longer be able to escape to outside the event horizon. In other words: everything that happens inside the event horizon of a black hole is invisible to outside observers.
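The calculation for Earth is easy to check in a few lines of code. This is just the two formulas from above typed out; the function names are made up for this sketch, and the constants are rounded to four significant figures.

```python
import math

G = 6.674e-11        # gravitational constant, m^3 kg^-1 s^-2
C = 2.998e8          # speed of light, m/s
M_EARTH = 5.972e24   # mass of Earth, kg


def escape_velocity(mass, r):
    """v = sqrt(2 G M / r), in m/s."""
    return math.sqrt(2.0 * G * mass / r)


def schwarzschild_radius(mass):
    """r = 2 G M / c^2, in meters."""
    return 2.0 * G * mass / C ** 2


r_earth_bh = schwarzschild_radius(M_EARTH)
print(f"Schwarzschild radius of Earth: {r_earth_bh * 1000:.2f} mm")
```

Plugging the result back into `escape_velocity(M_EARTH, r_earth_bh)` returns the speed of light, which is exactly how the radius was defined.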

# Escape Velocity (And: How Much Fuel Do Rockets Need?)

During the launch of a rocket, the Earth’s gravitational field is pulling the rocket back. The rocket needs a certain speed to be able to escape from the Earth’s gravitational field, such that it won’t fall back to Earth nor get into an orbit around it. Escape velocity is the speed a rocket requires to be able to escape from a body without having to burn more fuel later during the maneuver. For a body as massive as Earth, the required velocity is relatively high, and this is why rockets literally need tonnes of fuel.

In this post, by making a few simplifications and using the rocket equation that we found earlier, we will derive an equation to calculate the amount of propellant needed to escape from Earth.
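As a preview of the kind of numbers involved, here is a minimal sketch. It combines the escape velocity formula v = \sqrt{2GM/r} with the rocket equation in its usual form, \Delta v = v_e \ln(m_0 / m_1), and assumes the rocket reaches its final speed in one instantaneous burn from the surface with no drag. These are not necessarily the simplifications the full post makes, and the exhaust velocity and dry mass in the example are made-up values.

```python
import math

G = 6.674e-11        # gravitational constant, m^3 kg^-1 s^-2
M_EARTH = 5.972e24   # mass of Earth, kg
R_EARTH = 6.371e6    # mean radius of Earth, m


def escape_velocity(mass, r):
    """Escape velocity at distance r from the center of a body, in m/s."""
    return math.sqrt(2.0 * G * mass / r)


def propellant_mass(dry_mass, delta_v, exhaust_velocity):
    """Propellant needed for delta_v, from delta_v = v_e * ln(m_start / m_end)."""
    return dry_mass * (math.exp(delta_v / exhaust_velocity) - 1.0)


v_escape = escape_velocity(M_EARTH, R_EARTH)
# Hypothetical rocket: 1,000 kg without fuel, exhaust velocity of 3,000 m/s.
fuel = propellant_mass(1000.0, v_escape, 3000.0)
print(f"escape velocity: {v_escape / 1000:.1f} km/s, propellant: {fuel / 1000:.1f} tonnes")
```

Because the propellant mass grows exponentially with the required velocity, even a small rocket ends up carrying many times its own dry mass in fuel.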

# The Rocket Equation

Rockets in space, like all other objects, have to accelerate to change velocity. But space is a vacuum, so there is nothing to push against to create force. Instead, rockets accelerate by using the conservation of momentum. The momentum of an object is equal to the object’s mass multiplied by the object’s velocity: \vec{p} = m \vec{v}. In a closed system, the total momentum remains constant: \vec{p}_{0} = \vec{p}_{t}.

A rocket carries propellant that it expels at high velocities to accelerate. Imagine a rocket moving in space; at first the rocket is not expelling propellant and so its momentum does not change. Then it expels a part of its propellant. That propellant’s momentum is equal to its mass multiplied by its velocity. The rocket and propellant are part of a closed system, so the momentum of the rocket has to change such that the total momentum (that of the rocket plus that of the propellant) is equal to the momentum of the rocket before it expelled the propellant. As a result, the rocket gains velocity in the direction opposite to that of the propellant.

Let’s find out how much velocity the rocket gains!
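One way to get a feeling for the answer before doing any calculus is to let a computer apply conservation of momentum to many small chunks of propellant, one after the other. The sketch below does that. It assumes each chunk leaves at a fixed speed relative to the rocket’s velocity just before the chunk is expelled; the function name and the example numbers are made up for this illustration.

```python
import math


def delta_v_by_chunks(start_mass, end_mass, exhaust_speed, chunks):
    """Velocity gained by expelling propellant in equal chunks, one at a time."""
    chunk_mass = (start_mass - end_mass) / chunks
    mass, velocity = start_mass, 0.0
    for _ in range(chunks):
        # Momentum before: mass * velocity
        # Momentum after:  (mass - chunk_mass) * (velocity + gain)
        #                  + chunk_mass * (velocity - exhaust_speed)
        # Setting these equal and solving for the gain gives:
        gain = exhaust_speed * chunk_mass / (mass - chunk_mass)
        velocity += gain
        mass -= chunk_mass
    return velocity


for chunks in (10, 1000, 100_000):
    print(chunks, delta_v_by_chunks(10_000.0, 1000.0, 3000.0, chunks))
```

As the chunks get smaller, the total approaches `exhaust_speed * math.log(start_mass / end_mass)`, which is the rocket equation in its usual form.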

# Shorts: Vertical Travel Distance of Bouncing Objects

If a ball is dropped from a height of 10 meters, and on each bounce it reaches a maximal height of 0.75 times the previous height, then what is the total distance traveled? If we use h_i as the maximal height reached on bounce i, h_0 as the initial height, and b as the factor of the maximal height achieved relative to the previous bounce, we have h_i = b h_{i-1}.

Then, the total distance traveled is:

\begin{aligned}
d &= h_0 + 2 h_1 + 2 h_2 + 2 h_3 + \dots \\
&= h_0 + 2 b h_0 + 2 b h_1 + 2 b h_2 + \dots \\
&= h_0 + 2 b h_0 + 2 b b h_0 + 2 b b b h_0 + \dots \\
&= h_0 + 2 b h_0 + 2 b^2 h_0 + 2 b^3 h_0 + \dots \\
&= -h_0 + 2 (h_0 + b h_0 + b^2 h_0 + b^3 h_0 + \dots) \\
&= -h_0 + 2 h_0 (1 + b + b^2 + b^3 + \dots) \\
&= -h_0 + 2 h_0 \sum_{i=0}^{\infty} b^i
\end{aligned}

Using the geometric series \sum_{n=0}^{\infty} x^n = \frac{1}{1-x}, which holds for |x| < 1 (and so for our 0 \le b < 1), we get:

d = -h_0 + 2 h_0 \frac{1}{1-b}

Now we plug our values of h_0 = 10 \text{ m} and b = 0.75 into this equation to find d = -10 + 20 \cdot \frac{1}{0.25} = 70 meters. So, even though our ball will bounce on for eternity, it will travel exactly 70 meters!
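If you don’t trust the infinite sum, you can also add up the bounces one by one and watch the total creep towards the same answer. A small sketch, with function names made up for this post:

```python
def total_distance(h0, b, bounces):
    """Distance traveled after the initial drop plus the given number of bounces."""
    distance = h0
    height = h0
    for _ in range(bounces):
        height *= b                # maximal height of this bounce
        distance += 2.0 * height   # up and down again
    return distance


def total_distance_closed_form(h0, b):
    """The result derived above: d = -h0 + 2 h0 / (1 - b)."""
    return -h0 + 2.0 * h0 / (1.0 - b)


for bounces in (1, 10, 100):
    print(bounces, total_distance(10.0, 0.75, bounces))
print("limit:", total_distance_closed_form(10.0, 0.75))
```

Every extra bounce adds less than the one before it, which is why the total never gets past 70 meters.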

# Can an Object with Constant Speed Be Accelerating?

An interesting question is whether an object with a constant speed can still be accelerating, and intuitively the answer would be “no”: acceleration means the object is speeding up or slowing down, right? Apparently, this is not the case. It actually is possible to have a constant speed while still having an acceleration. In this post, we will look at how this is possible. Hint: circular motion!
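To see the hint in action, here is a minimal numerical sketch of an object moving around a circle at constant speed. The setup (a circle around the origin, motion that is counter-clockwise) and the function names are choices made for this illustration. The acceleration is estimated by looking at how the velocity vector changes over a very short time.

```python
import math


def velocity_on_circle(radius, speed, t):
    """Velocity vector at time t for uniform circular motion around the origin."""
    omega = speed / radius  # angular velocity in rad/s
    return (-speed * math.sin(omega * t), speed * math.cos(omega * t))


def acceleration_estimate(radius, speed, t, dt=1e-6):
    """Estimate the acceleration as the change in velocity over a short time."""
    vx0, vy0 = velocity_on_circle(radius, speed, t - dt)
    vx1, vy1 = velocity_on_circle(radius, speed, t + dt)
    return ((vx1 - vx0) / (2.0 * dt), (vy1 - vy0) / (2.0 * dt))


radius, speed = 2.0, 3.0
for t in (0.0, 0.5, 1.0):
    vx, vy = velocity_on_circle(radius, speed, t)
    ax, ay = acceleration_estimate(radius, speed, t)
    print(f"t={t}: speed={math.hypot(vx, vy):.3f}, acceleration={math.hypot(ax, ay):.3f}")
```

The speed stays the same at every moment, but the direction of the velocity keeps changing, so the acceleration is not zero. Its magnitude matches the well-known centripetal value `speed ** 2 / radius`.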

# Relative Velocity (Or: The Velocity-Addition Formula)

We know that the speed of light is constant and the same in every reference frame, and that no massive object can travel at or faster than the speed of light in any reference frame. So what happens when two spaceships leave a space station in opposite directions and both reach a constant speed of 0.95c (with c the speed of light) relative to that space station? In the space station’s reference frame, the distance between the two ships grows at 2 \cdot 0.95c = 1.9c. This doesn’t violate relativity, as neither spaceship is itself moving at or faster than the speed of light in that frame. But if we ignored the effects of relativity, then in the reference frame of either spaceship the other ship would appear to travel at 1.9c. This would violate relativity!

Applying the equations we found before, we will find out what actually happens.
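As a quick preview of the outcome, the standard velocity-addition formula for motion along a single line is w = \frac{u + v}{1 + uv/c^2}. The sketch below simply evaluates it, with all speeds written as fractions of c. It does not follow the derivation from the full post, and the function name is made up for this illustration.

```python
def add_velocities(u, v):
    """Relativistic sum of two velocities along the same line, as fractions of c."""
    return (u + v) / (1.0 + u * v)


# Speed of one spaceship as seen from the other, both doing 0.95c relative to the station.
print(add_velocities(0.95, 0.95))

# At everyday speeds the correction is negligible and velocities simply add up.
print(add_velocities(1e-7, 1e-7))
```

For the two spaceships this gives roughly 0.9987c: very close to the speed of light, but still below it, so relativity is safe.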