Keep the "Info" Before the "Graphic"

The term “infographic” is a ridiculous little buzzword that really took off on the internet sometime last year. It used to refer to genuinely useful things like subway maps and blueprints. Recently, however, the term has come to mean “an obnoxiously oversized image that has numbers on it”. My problem isn’t with infographics like these ones that just display some fun, meaningless information in a visual way, or this one that displays a phenomenon that is inherently visual. My beef is with infographics that reduce a variety of related statistics to an oversized mess of overlapping graphs and charts that are (purposely or otherwise) misleading.

This post will present four rules that infographic designers, if they decide that they absolutely must make an infographic, should always follow (but often don’t). To get the ball rolling, let’s consider an example that made its way around the internet just a couple of weeks ago (source):

American 2009 Season Premieres and Averages to Date

We are told that the above infographic depicts the US viewership for a variety of shows during their premiere (light red) and on average since they began their 2009 season (dark red). However, I have two main problems with the image, and they’re both problems that are prevalent throughout many infographics and can easily be solved by just using a simple bar graph.

1. Infographics should not require horizontal scrolling. The above infographic is 3133 pixels wide, which means there is no consumer-available monitor in the world capable of displaying the entire image on one screen without scrunching it down. This is apparently exactly what infographic makers want, since they all seem to subscribe to the school of thought that dictates their image deserves 45 inches of horizontal viewing space. This would be fine if infographics were readable when zoomed out, but by their very nature they almost never are.

Computer monitors were not meant to view posters. If you want to make the image high-resolution enough that it can be printed out as a poster then it should be created as a vector graphic, not a raster graphic. If you still insist that your infographic should be a monstrously large bitmap, make it readable from a zoom level that will fit on standard monitor resolutions.

Some other popular infographics that suffer from this problem are the new auto industry breakdown, weight of the world, and the first 100 days.

2. Two-dimensional figures should never be used to compare linear data. The above infographic compares the number of people watching different shows, so why are circles being used to represent the data? What represents the number of viewers — the radius of the circle or the area of the circle? The source doesn’t tell us, so we have no way of appropriately assessing how many more people are viewing NCIS: Los Angeles than The Good Wife. If it’s the radius of the circle, NCIS appears to have about 5% more viewers. If it’s the area of the circle then it’s probably over 10% (and the discrepancy gets much larger if you compare shows that are farther apart).

Furthermore, even if we were told whether it’s the radii of the circles or their areas that we should be looking at, there’s still a problem. If the radii are what are being compared, then the visual is misleading because the differences in areas cause the relative differences to appear larger than they actually are. If the areas are what are being compared, then it should be noted that people just plain suck at visually comparing areas. By looking at the above image (and not getting out a ruler or anything) can you tell which circles have about half as much area as the NCIS: Los Angeles circle? Can you tell how much higher the viewership of The Good Wife is than that of Glee? I certainly can’t, at least not quickly.
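
To make the radius-versus-area ambiguity concrete, here is a minimal Python sketch. The function names are made up for illustration, and the 1.05 radius ratio is just the rough "about 5%" figure quoted above, not a measurement taken from the image.

```python
import math

def implied_ratios(radius_ratio):
    """Data ratio implied by a measured ratio of circle radii, under each reading."""
    return {
        "radius encodes value": radius_ratio,
        "area encodes value": radius_ratio ** 2,  # area scales with the square of the radius
    }

def radius_for_area_fraction(area_fraction):
    """Radius scale factor needed to draw a circle with the given fraction of the area."""
    return math.sqrt(area_fraction)

for reading, ratio in implied_ratios(1.05).items():
    print(f"{reading}: {100 * (ratio - 1):.2f}% more viewers")

# A circle with half the area still has about 71% of the radius,
# which is part of why "half as big" is so hard to judge by eye.
print(f"{radius_for_area_fraction(0.5):.3f}")
```

The same 5% difference in radius reads as a difference of a little over 10% if area is the intended encoding, and the gap between the two readings grows quickly for circles that differ more in size.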

InformationIsBeautiful.net is a particularly notorious violator of this rule, as these three examples show: deadliest drugs, how safe is the HPV vaccine?, reduce your chances of dying in a plane crash (scroll down to the “bad month” and “the odds” sections). What’s worse is that they aren’t even consistent about whether it’s the areas of the circles or the radii of the circles they’re comparing.

Problems #1 and #2 can both be rectified by simply turning the data into a bar graph. A plain old-fashioned bar graph. Voila:

American 2009 Season Premieres and Averages to Date (easier to read)

The above bar graph doesn’t need to be zoomed in to be read, it makes it easier to compare the relative viewership of each show, and it actually contains more data than the previous infographic thanks to the labels on the vertical axis.

The next example (source) supposedly explains how and why low-cost airlines are able to offer flights that are so much cheaper than other airlines. It made its rounds this last spring during recession fever, when anything that had anything to do with something being cheap was instantly popular. While it does not suffer from problem #1 above (since it is readable when zoomed out), it suffers from two instances of problem #2 as well as multiple other problems.

How come cheap airlines are so cheap?

3. Infographics (and everything else) should be about substance over style. While there’s no denying that the above infographic is pretty, does it actually tell us anything? Beyond the myriad of small problems such as the average fare of Southwest flights including cents when none of the other numbers do, the misspelling of “Aer Lingus” and “maintenance”, and the mysterious 43% “total advantage” at the bottom that seems to pop out of nowhere, the infographic at its core doesn’t even make sense.

As the infographic itself says, low-cost airlines generally don’t do long-haul flights; they focus on short point-to-point routes. So why are their average fares being compared to the average fares of the likes of British Airways, who regularly do intercontinental flights? Doesn’t it make sense that travel distance makes more of a contribution to the price of the flight than whether or not tickets are sold primarily online? Average fare per kilometer travelled would make more sense to compare, though it would still be misleading because take-off and landing are disproportionately expensive.

Another recent offending infographic that just simply doesn’t say a thing is the $400 million club, which notes that Transformers: Revenge of the Fallen is only the ninth movie in history to gross more than $400 million at the box office in the US during its theatrical run. The infographic then compares the other eight movies, which of course are juggernauts like Star Wars and Titanic. The problem is that none of the figures are adjusted for inflation. If you scale the numbers properly, Transformers: Revenge of the Fallen actually comes in as about the 65th highest-grossing movie. Impressive, sure, but to say that the infographic is misleading is an understatement.

I will finish by presenting a graphic that ran on NewsWeek.com that shows obesity and “life evaluation” trends over the last year or two. It’s debatable whether or not it falls into the category of what most people would consider an “infographic”, but it perfectly illustrates a core problem with them.

4. Be careful with your data. Just making your graphic pretty doesn’t give you free rein to ignore basic statistical principles when presenting data. In the above graphic, the left graph shows two lines: one showing how many people have BMI less than or equal to 30 in a given month and one showing how many people have BMI over 30 in a given month. I have a news flash for you, NewsWeek: one of those lines is redundant, since each line completely determines the other. Not only that, but the redundant second line manipulates the reader by giving the false impression that the number of obese people is converging toward the number of non-obese people. Never mind the fact that the vertical scale is completely out of whack: it jumps a vertical distance of 46.4% in the same amount of space that is used to represent about a 2.5% jump elsewhere.

I’m willing to bet that the vertical scale on the right graph is completely out of whack too, but it’s a little difficult to tell since they don’t tell you what percentages any of the intermediate y-values correspond to. On the blue “struggling” line, we are given a value of 48.4% on the left edge of the graph and a value of 49.6% at the right edge of the graph at a nearly identical height. Are we supposed to be able to tell how high and low the peaks in the middle of the graph are based on that? Does the blue line get as low as 40%? 35%? 30%? Would labels along the vertical axis (similar to the bar graph I showed above) really have detracted from the desired aesthetic too much?

So if you have a set of data that you wish to convey graphically, please first consider whether or not it can be presented by a simple bar graph or line graph. If it can, don’t try to make it more complicated than that. If it can’t, at least make sure that the information is the motivating factor in your decisions. If the layout ends up dictating how you present your data, you’ve got your priorities backward.

Approximating the Distribution of Schmidt Vector Norms

Space	k	Mean	Median	Std. Dev.
C³ ⊗ C³	1	0.8494	0.8516	0.0554
C³ ⊗ C³	2	0.9811	0.9860	0.0171
C⁴ ⊗ C⁴	1	0.7799	0.7792	0.0501
C⁴ ⊗ C⁴	2	0.9411	0.9435	0.0247
C⁴ ⊗ C⁴	3	0.9921	0.9943	0.0074
C⁵ ⊗ C⁵	1	0.7240	0.7225	0.0444
C⁵ ⊗ C⁵	2	0.8976	0.8987	0.0268
C⁵ ⊗ C⁵	3	0.9707	0.9722	0.0129
C⁵ ⊗ C⁵	4	0.9960	0.9971	0.0039

Spaceship Speed Limits in "B3" Life-Like Cellular Automata

Those of you familiar with Conway’s Game of Life probably know of its two most basic spaceships: the glider and the lightweight spaceship (shown below). The glider travels diagonally by one cell every four generations (and thus its speed is said to be “c/4”) and the lightweight spaceship travels orthogonally by two cells every four generations (and so its speed is denoted by “2c/4” or “c/2”).

The glider

Lightweight spaceship
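
As a quick sanity check of these two speeds, here is a minimal Python sketch that runs the Game of Life on a set of live cells and reports how far a pattern has moved once it reappears. The helper names `step` and `displacement_after` and the particular cell coordinates are illustrative choices (with y increasing downward), not something taken from the figures.

```python
from collections import Counter

def step(live):
    """Advance a set of live (x, y) cells by one generation of B3/S23."""
    neighbour_counts = Counter(
        (x + dx, y + dy)
        for (x, y) in live
        for dx in (-1, 0, 1)
        for dy in (-1, 0, 1)
        if (dx, dy) != (0, 0)
    )
    return {
        cell
        for cell, count in neighbour_counts.items()
        if count == 3 or (count == 2 and cell in live)
    }

def displacement_after(pattern, generations):
    """Return (dx, dy) if the pattern reappears translated, otherwise None."""
    current = set(pattern)
    for _ in range(generations):
        current = step(current)
    if not current:
        return None
    dx = min(x for x, _ in current) - min(x for x, _ in pattern)
    dy = min(y for _, y in current) - min(y for _, y in pattern)
    moved = {(x + dx, y + dy) for x, y in pattern}
    return (dx, dy) if moved == current else None

GLIDER = {(1, 0), (2, 1), (0, 2), (1, 2), (2, 2)}
LWSS = {(1, 0), (4, 0), (0, 1), (0, 2), (4, 2), (0, 3), (1, 3), (2, 3), (3, 3)}

print(displacement_after(GLIDER, 4))  # one cell diagonally in 4 generations: c/4
print(displacement_after(LWSS, 4))    # two cells orthogonally in 4 generations: c/2
```

Storing only the live cells in a set keeps the grid effectively unbounded, which is convenient when the pattern of interest keeps moving.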

A natural question to ask is whether or not there are any spaceships that travel faster than c/4 diagonally or c/2 orthogonally. John Conway proved in 1970 (very shortly after inventing the Game of Life) that the answer is no. I present this proof here, since it’s a bit difficult to find online (though Dave Greene was kind enough to post a copy of it on the ConwayLife.com forums).

Theorem 1. The maximum speed that a spaceship can travel in Conway’s Game of Life is c/4 diagonally and c/2 orthogonally.

Proof. We begin by proving the c/4 speed limit for diagonal spaceships. Consider the grid given in Figure 1 (below). Suppose that the spaceship lies on and to the left of the diagonal line of cells defined by A, B, C, D, and E in generation 0, and suppose for the sake of contradiction that cell X is alive in generation 2.

Figure 1: The spaceship is to the left of A, B, C, D, and E

Well, if cell X is alive in generation 2, then cells C, U, and V must be alive in generation 1. This means that U and V must each have had 3 live neighbours in generation 0, so each of B, C, D, J, and K must be alive in generation 0. This means that C must have at least four live neighbours in generation 0 though, so there is no way for it to survive to generation 1, which gives a contradiction.

It follows that X can not be alive in generation 2. In other words, if the spaceship is behind the diagonal line A, B, C, D, E in generation 0, then it must be behind the diagonal line defined by U and V in generation 2. It follows that the spaceship can not travel faster than c/4 diagonally.

To see the corresponding result for orthogonal spaceships, just use two diagonal lines as in Figure 2. If a spaceship is on and below the diagonal lines defined by the solid black cells in generation 0, then we already saw that it must be on and below the diagonal lines defined by the striped cells in generation 2. It follows that it can not travel faster than c/2 orthogonally.

Figure 2: The spaceship is on and below the solid black cells in generation 0

Notice that this result doesn’t only apply to spaceships, but also to other configurations that are (initially) finite and travel across the grid, such as puffers and wickstretchers. Also, this result applies to many Life-like cellular automata — not just Conway’s Game of Life.

In particular, these speed limits apply to any of the 2¹² = 4096 Life-like cellular automata in the range B3/S – B345678/S0123678. That is, these speed limits apply to any rule on the 2D square lattice such that birth occurs for 3 neighbours but not 0, 1, or 2 neighbours, and survival does not occur for 4 or 5 neighbours. But are the spaceship speed limits attained in each of these rules? The regular c/4 glider only works in the 2⁸ = 256 rules from B3/S23 – B3678/S0235678. In the remaining rules, not much is known; some of them have c/3 orthogonal spaceships, some have c/5 orthogonal spaceships, and some have no spaceships at all (such as any of the rules containing S0123, which can not contain spaceships because the trailing edge of the spaceship could never die). Of particular interest are the sidewinder and this spaceship, which play the c/4 diagonal and c/2 orthogonal roles of the glider and lightweight spaceship, respectively, in B3/S13 (as well as several other rules).
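
The count of 4096 rules can be double-checked by brute force. The following Python sketch encodes the condition stated above (birth on 3 neighbours but not on 0, 1, or 2, and no survival on 4 or 5) and counts how many of the 2⁹ × 2⁹ possible birth/survival combinations satisfy it. The function names are illustrative assumptions.

```python
def parse_rule(rulestring):
    """Split a rulestring such as 'B3/S23' into sets of birth and survival counts."""
    birth_part, survival_part = rulestring.upper().split("/")
    birth = {int(ch) for ch in birth_part.lstrip("B")}
    survival = {int(ch) for ch in survival_part.lstrip("S")}
    return birth, survival

def theorem1_applies(birth, survival):
    """True when the hypotheses used in the proof of Theorem 1 hold for the rule."""
    return 3 in birth and not (birth & {0, 1, 2}) and not (survival & {4, 5})

def counts_from_mask(mask):
    """Decode a 9-bit mask into the set of neighbour counts 0..8 that it selects."""
    return {k for k in range(9) if (mask >> k) & 1}

covered = sum(
    theorem1_applies(counts_from_mask(b), counts_from_mask(s))
    for b in range(512)
    for s in range(512)
)
print(covered)                                     # 4096
print(theorem1_applies(*parse_rule("B3/S23")))     # True: Conway's Game of Life
print(theorem1_applies(*parse_rule("B3/S12345")))  # False: survival on 4 and 5
```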

So what about the other B3 (but not B0, B1, or B2) rules? If cells survive when they have 4 or 5 live neighbours, then it’s conceivable that spaceships might be able to travel faster than c/4 diagonally or c/2 orthogonally because Theorem 1 does not apply to them. It turns out that they indeed can travel faster diagonally, but somewhat surprisingly they can not travel faster orthogonally.

Theorem 2. In any Life-like cellular automaton in which birth occurs when a cell has 3 live neighbours but not 0, 1, or 2 live neighbours, the maximum speed that a spaceship can travel is c/3 diagonally and c/2 orthogonally.

Proof. The trick here is to consider lines of slope -1/2 as in Figure 3 below. It is possible (though a bit more complicated) to prove the c/3 diagonal speed limit using a diagonal line as in Figure 1 for Theorem 1, but the orthogonal speed limit that results is 2c/3. What is presented here is the only method I know of proving both the diagonal speed limit of c/3 and the orthogonal speed limit of c/2.

Figure 3: The spaceship is below A, B, C, D, E, and F in generation 0

Suppose that a spaceship is on and below the line defined by the cells A, B, C, D, E, and F in Figure 3 in generation 0. It is clear that Y can not be alive in generation 2, since its only neighbour that could possibly be alive in generation 1 is K. Similarly, X can not be alive in generation 2 because its only neighbours that can be alive in generation 1 are B and K. It follows that in generation 2, the spaceship can not be more than 1 cell above the line A, B, C, D, E, F.

More mathematically, this tells us that a spaceship that travels x cells horizontally for every y cells vertically can not travel faster than max{x,y}c/(x+2y). Taking x = y = 1 (diagonal spaceships) gives a speed limit of c/3. Taking x = 0, y = 1 (orthogonal spaceships) gives a speed limit of c/2.
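
For anyone who wants to plug in numbers, the bound is easy to evaluate exactly. This is a small Python sketch of the formula max{x,y}c/(x+2y) only; the function name `speed_limit` is an illustrative assumption, and the result is expressed as a fraction of c.

```python
from fractions import Fraction

def speed_limit(x, y):
    """Upper bound max{x, y}/(x + 2y) on the speed, as a fraction of c."""
    if x < 0 or y < 0 or (x == 0 and y == 0):
        raise ValueError("x and y must be non-negative and not both zero")
    return Fraction(max(x, y), x + 2 * y)

print(speed_limit(1, 1))  # 1/3, the diagonal case
print(speed_limit(0, 1))  # 1/2, the orthogonal case
```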

Finally, it should be noted that even though these spaceship speed upper bounds apply to a wide variety of different rules, many rules don’t even have spaceships (even relatively simple rules containing B3 in their rulestring). For example, no spaceships are currently known in the rule “maze” (B3/S12345), and it seems quite believable that there are no spaceships to be found in that rule. I would love to see a proof that maze contains no spaceships, but it seems that there are too many cases to check by hand. I may end up trying a computer proof sometime in the near future.

The Equivalences of the Choi-Jamiolkowski Isomorphism (Part II)

This is a continuation of this post.
Please read that post to learn what the Choi-Jamiolkowski isomorphism is.

In part 1, we learned about hermicity-preserving linear maps, positive maps, k-positive maps, and completely positive maps. Now let’s see what other types of linear maps have interesting equivalences through the Choi-Jamiolkowski isomorphism. Recall that the notation C_Φ is used to represent the Choi matrix of the linear map Φ.

6. Entanglement Breaking Maps / Separable Quantum States

An entanglement breaking map is defined as a completely positive map Φ with the property that (id_n ⊗ Φ)(ρ) is a separable quantum state whenever ρ is a quantum state (i.e., a density operator). A separable quantum state σ is one that can be written in the form

$\sigma=\sum_ip_i\sigma_i\otimes\tau_i,$

where {p_i} forms a probability distribution (i.e., p_i ≥ 0 for all i and the p_i’s sum to 1) and each σ_i and τ_i is a density operator. It turns out that the Choi-Jamiolkowski equivalence for entanglement-breaking maps is very natural: Φ is entanglement breaking if and only if C_Φ is separable. Because it is known that determining whether or not a given state is separable is NP-hard [1], it follows that determining whether or not a given linear map is entanglement breaking is also NP-hard. Nonetheless, there are several nice characterizations of entanglement breaking maps. For example, Φ is entanglement breaking if and only if it can be written in the form

$\Phi(X)=\sum_iA_iXA_i^*,$

where each operator A_i has rank 1 (recall from Section 4 of the previous post that every completely positive map can be written in this form for some operators A_i — the rank 1 condition is what makes the map entanglement breaking). For more properties of entanglement breaking maps, the interested reader is encouraged to read [2].

7. k-Partially Entanglement Breaking Maps / Quantum States with Schmidt Number at Most k

The natural generalization of entanglement breaking maps is the class of k-partially entanglement breaking maps, which are completely positive maps Φ with the property that (id_n ⊗ Φ)(ρ) always has Schmidt number [3] at most k for any density operator ρ. Recall that an operator has Schmidt number 1 if and only if it is separable, so the k = 1 case recovers exactly the entanglement breaking maps of Section 6. The set of operators associated with the k-partially entanglement breaking maps via the Choi-Jamiolkowski isomorphism are exactly what we would expect: the operators with Schmidt number no larger than k. In fact, pretty much all of the properties of entanglement breaking maps generalize in a completely natural way to this situation. For example, a map is k-partially entanglement breaking if and only if it can be written in the form

$\Phi(X)=\sum_iA_iXA_i^*,$

where each operator A_i has rank no greater than k. For more information about k-partially entanglement breaking maps, the interested reader is pointed to [4]. Additionally, there is an interesting geometric relationship between k-positive maps (see Section 5 of the previous post) and k-partially entanglement breaking maps that is explored in this note and in [5].

8. Unital Maps / Operators with Left Partial Trace Equal to Identity

A linear map Φ is said to be unital if it sends the identity operator to the identity operator — that is, if Φ(I_n) = I_m. It is a simple exercise in linear algebra to show that Φ is unital if and only if

${\rm Tr}_1(C_\Phi)=I_m,$

where Tr₁ denotes the partial trace over the first subsystem. In fact, it is not difficult to show that Tr₁(C_Φ) always equals exactly Φ(I_n).

9. Trace-Preserving Maps / Operators with Right Partial Trace Equal to Identity

In quantum information theory, maps that are trace-preserving (i.e., maps Φ such that Tr(Φ(X)) = Tr(X) for every operator X ∈ M_n) are of particular interest because quantum channels are modeled by completely positive trace-preserving maps (see Section 4 of the previous post to learn about completely positive maps). Well, some simple linear algebra shows that the map Φ is trace-preserving if and only if

${\rm Tr}_2(C_\Phi)=I_n,$

where Tr₂ denotes the partial trace over the second subsystem. The reason for the close relationship between this property and the property of Section 8 is that unital maps and trace-preserving maps are dual to each other in the Hilbert-Schmidt inner product.
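
The two partial trace conditions of Sections 8 and 9 are easy to check numerically. Below is a minimal NumPy sketch; the helper names `choi_matrix` and `partial_trace` are illustrative, and the example map Φ(X) = Tr(X)·I_m/m from M_2 to M_3 is an assumption chosen because it is trace-preserving but not unital.

```python
import numpy as np

def choi_matrix(phi, n):
    """Choi matrix of phi in the standard basis: block (i, j) is phi(|e_i><e_j|)."""
    blocks = []
    for i in range(n):
        row = []
        for j in range(n):
            unit = np.zeros((n, n), dtype=complex)
            unit[i, j] = 1
            row.append(np.asarray(phi(unit)))
        blocks.append(row)
    return np.block(blocks)

def partial_trace(C, n, m, subsystem):
    """Trace out subsystem 1 (dimension n) or subsystem 2 (dimension m)."""
    T = np.asarray(C).reshape(n, m, n, m)  # indices: (i, a, j, b)
    if subsystem == 1:
        return np.einsum("iaib->ab", T)    # sum over i = j, leaving an m-by-m operator
    return np.einsum("aibi->ab", T)        # sum over the second factor, leaving n-by-n

n, m = 2, 3

def phi(X):
    """Example map from M_2 to M_3: X -> Tr(X) * I_3 / 3."""
    return np.trace(X) * np.eye(m) / m

C = choi_matrix(phi, n)
print(np.allclose(partial_trace(C, n, m, 1), phi(np.eye(n))))  # Tr_1(C) equals phi(I_n)
print(np.allclose(partial_trace(C, n, m, 1), np.eye(m)))       # unital?
print(np.allclose(partial_trace(C, n, m, 2), np.eye(n)))       # trace-preserving?
```

For this particular map the first and third checks succeed while the second fails, since Φ(I_2) = (2/3)·I_3.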

10. Completely Co-Positive Maps / Positive Partial Transpose Operators

A map Φ such that T○Φ is completely positive, where T represents the transpose map, is called a completely co-positive map. Thanks to Section 4 of the previous post, we know that Φ is completely co-positive if and only if the Choi matrix of T○Φ is positive semi-definite. Another way of saying this is that

$(id_n\otimes T)(C_\Phi)\geq 0.$

This condition says that the operator C_Φ has positive partial transpose (or PPT), a property that is of great interest in quantum information theory because of its connection with the problem of determining whether or not a given quantum state is separable. In particular, any quantum state that is separable must have positive partial transpose (a condition that has become known as the Peres-Horodecki criterion). If n = 2 and m ≤ 3, then the converse is also true: any PPT state is necessarily separable [6]. It follows via our equivalences of Sections 4 and 6 that any entanglement breaking map is necessarily completely co-positive. Conversely, if n = 2 and m ≤ 3 then any map that is both completely positive and completely co-positive must be entanglement breaking.
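
Here is a small NumPy sketch of the partial transpose test. The function names are illustrative assumptions, and the two example Choi matrices (for the identity map and for the transpose map on M_2) are written down directly rather than computed from the maps.

```python
import numpy as np

def partial_transpose(C, n, m):
    """Apply (id_n ⊗ T) to an operator on C^n ⊗ C^m."""
    T = np.asarray(C).reshape(n, m, n, m)  # indices: (i, a, j, b)
    # Swapping the two indices of the second subsystem transposes it.
    return T.transpose(0, 3, 2, 1).reshape(n * m, n * m)

def has_positive_partial_transpose(C, n, m, tol=1e-10):
    """True when (id_n ⊗ T)(C) is positive semidefinite, for Hermitian C."""
    eigenvalues = np.linalg.eigvalsh(partial_transpose(C, n, m))
    return bool(eigenvalues.min() >= -tol)

n = m = 2
omega = np.eye(n).reshape(n * n)                         # sum_i e_i ⊗ e_i
choi_identity = np.outer(omega, omega)                   # Choi matrix of the identity map
choi_transpose = partial_transpose(choi_identity, n, m)  # Choi matrix of T (the swap operator)

print(has_positive_partial_transpose(choi_identity, n, m))   # identity map: not co-positive
print(has_positive_partial_transpose(choi_transpose, n, m))  # transpose map: co-positive
```

The identity map is completely positive but fails the test, while the transpose map passes it even though it is not completely positive, so neither property implies the other.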

11. Entanglement Binding Maps / Bound Entangled States

A bound entangled state is a state that is entangled (i.e., not separable) yet can not be transformed via local operations and classical communication to a pure maximally entangled state. In other words, such states are entangled but have zero distillable entanglement. Currently, the only states that are known to be bound entangled are states with positive partial transpose; it is an open question whether or not other such states exist.

An entanglement binding map [7] is a completely positive map Φ such that (id_n ⊗ Φ)(ρ) is bound entangled for any quantum state ρ. It turns out that a map is entanglement binding if and only if its Choi matrix C_Φ is bound entangled. Thus, via the result of Section 10 we see that a map is entanglement binding if it is both completely positive and completely co-positive but not entanglement breaking (so that its Choi matrix has positive partial transpose yet is not separable). It is currently unknown if there exist other entanglement binding maps.

References:

[1] L. Gurvits, Classical deterministic complexity of Edmonds’ Problem and quantum entanglement, Proceedings of the thirty-fifth annual ACM symposium on Theory of computing, 10-19 (2003). arXiv:quant-ph/0303055v1
[2] M. Horodecki, P. W. Shor, M. B. Ruskai, General Entanglement Breaking Channels, Rev. Math. Phys 15, 629–641 (2003). arXiv:quant-ph/0302031v2
[3] B. Terhal, P. Horodecki, A Schmidt number for density matrices, Phys. Rev. A Rapid Communications Vol. 61, 040301 (2000). arXiv:quant-ph/9911117v4
[4] D. Chruscinski, A. Kossakowski, On partially entanglement breaking channels, Open Sys. Information Dyn. 13, 17–26 (2006). arXiv:quant-ph/0511244v1
[5] L. Skowronek, E. Stormer, K. Zyczkowski, Cones of positive maps and their duality relations, J. Math. Phys. 50, 062106 (2009). arXiv:0902.4877v1 [quant-ph]
[6] M. Horodecki, P. Horodecki, R. Horodecki, Separability of Mixed States: Necessary and Sufficient Conditions, Physics Letters A 223, 1–8 (1996). arXiv:quant-ph/9605038v2
[7] P. Horodecki, M. Horodecki, R. Horodecki, Binding entanglement channels, J.Mod.Opt. 47, 347–354 (2000). arXiv:quant-ph/9904092v1

The Equivalences of the Choi-Jamiolkowski Isomorphism (Part I)

The Choi-Jamiolkowski isomorphism is an isomorphism between linear maps from M_n to M_m and operators living in the tensor product space M_n ⊗ M_m. Given any linear map Φ : M_n → M_m, we can define the Choi matrix of Φ to be

$C_\Phi:=\sum_{i,j=1}^n|e_i\rangle\langle e_j|\otimes\Phi(|e_i\rangle\langle e_j|),\text{ where }\big\{|e_i\rangle\big\}\text{ is an orthonormal basis of $\mathbb{C}^n$}.$

It turns out that this association between Φ and C_Φ defines an isomorphism, which has become known as the Choi-Jamiolkowski isomorphism. Because much is already known about linear operators, the Choi-Jamiolkowski isomorphism provides a simple way of studying linear maps on operators — just study the associated linear operators instead. Thus, since there does not seem to be a list compiled anywhere of all of the known associations through this isomorphism, I figure I might as well start one here. I’m planning on this being a two-parter post because there’s a lot to be said.
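
To make the definition concrete, here is a minimal NumPy sketch that builds the Choi matrix of a map given as a Python function. The name `choi_matrix` is an illustrative assumption, and the standard basis is used for {|e_i⟩}.

```python
import numpy as np

def choi_matrix(phi, n):
    """Return the Choi matrix of a linear map phi acting on n-by-n matrices.

    Block (i, j) of the result is phi(|e_i><e_j|), which is the same thing as
    summing |e_i><e_j| ⊗ phi(|e_i><e_j|) over i and j in the standard basis.
    """
    blocks = []
    for i in range(n):
        row = []
        for j in range(n):
            unit = np.zeros((n, n), dtype=complex)
            unit[i, j] = 1
            row.append(np.asarray(phi(unit)))
        blocks.append(row)
    return np.block(blocks)

identity_choi = choi_matrix(lambda X: X, 2)     # Choi matrix of the identity map
transpose_choi = choi_matrix(lambda X: X.T, 2)  # Choi matrix of the transpose map

print(identity_choi.real.astype(int))
print(transpose_choi.real.astype(int))
```

For the identity map this gives the (unnormalized) projection onto the maximally entangled vector, and for the transpose map it gives the swap operator.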

1. All Linear Maps / All Operators

By the very fact that we’re talking about an isomorphism, it follows that the set of all linear maps from M_n to M_m corresponds to the set of all linear operators in M_n ⊗ M_m. One can then use the singular value decomposition on the Choi matrix of the linear map Φ to see that we can find sets of operators {A_i} and {B_i} such that

$\Phi(X)=\sum_iA_iXB_i.$

To construct the operators A_i and B_i, simply reshape the left singular vectors and right singular vectors of the Choi matrix and multiply the A_i operators by the corresponding singular values. An alternative (and much more mathematically-heavy) method of proving this representation of Φ is to use the Generalized Stinespring Dilation Theorem [1, Theorem 8.4].
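
The reshaping step can be sketched in a few lines of NumPy. The function names below are illustrative, and the exact reshape and transpose conventions are assumptions that depend on the ordering used for the Choi matrix (here, block (i, j) equal to Φ(|e_i⟩⟨e_j|)). The transpose map is used as the test case because it is a simple example that is not completely positive.

```python
import numpy as np

def choi_matrix(phi, n):
    """Choi matrix of phi in the standard basis: block (i, j) is phi(|e_i><e_j|)."""
    blocks = []
    for i in range(n):
        row = []
        for j in range(n):
            unit = np.zeros((n, n), dtype=complex)
            unit[i, j] = 1
            row.append(np.asarray(phi(unit)))
        blocks.append(row)
    return np.block(blocks)

def operator_sum_from_choi(C, n, m, tol=1e-12):
    """Return pairs (A_k, B_k) with phi(X) = sum_k A_k X B_k, built from the SVD of C."""
    U, singular_values, Vh = np.linalg.svd(C)
    pairs = []
    for k, sigma in enumerate(singular_values):
        if sigma > tol:
            A = sigma * U[:, k].reshape(n, m).T  # m-by-n, scaled by the singular value
            B = Vh[k, :].reshape(n, m)           # n-by-m
            pairs.append((A, B))
    return pairs

n = m = 2
pairs = operator_sum_from_choi(choi_matrix(lambda X: X.T, n), n, m)

rng = np.random.default_rng(0)
X = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
rebuilt = sum(A @ X @ B for A, B in pairs)
print(np.allclose(rebuilt, X.T))
```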

2. Hermicity-Preserving Maps / Hermitian Operators

The set of Hermicity-Preserving linear maps (that is, maps Φ such that Φ(X) is Hermitian whenever X is Hermitian) corresponds to the set of Hermitian operators. By using the spectral decomposition theorem on C_Φ and recalling that Hermitian operators have real eigenvalues, it follows that there are real constants {λ_i} such that

$\Phi(X)=\sum_i\lambda_iA_iXA_i^*.$

Again, the trick is to construct each A_i so that the vectorization of A_i is the i-th eigenvector of C_Φ and λ_i is the corresponding eigenvalue. Because every Hermitian operator can be written as the difference of two positive semidefinite operators, it is a simple corollary that every Hermicity-preserving map can be written as the difference of two completely positive linear maps (this will become clearer after Section 4). It is also clear that we can absorb the magnitude of the constant λ_i into the operator A_i, so we can write any Hermicity-preserving linear map in the form above, where each λ_i = ±1.

3. Positive Maps / Block Positive Operators

A linear map Φ is said to be positive if Φ(X) is positive semidefinite whenever X is positive semidefinite. A useful characterization of these maps remains out of reach, and finding one is currently a very active area of research in quantum information science and operator theory. The associated operators C_Φ are those that satisfy

$(\langle a|\otimes\langle b|)C_\Phi(|a\rangle\otimes|b\rangle)\geq 0\quad\forall\,|a\rangle\in\mathbb{C}^n,\ |b\rangle\in\mathbb{C}^m.$

In terms of quantum information, these operators are positive on separable states. In the world of operator theory, these operators are usually referred to as block positive operators. As of yet we do not have a quick deterministic method of testing whether or not an operator is block positive (and thus we do not have a quick deterministic way of testing whether or not a linear map is positive).

4. Completely Positive Maps / Positive Semidefinite Operators

The most famous class of linear maps in quantum information science, completely positive maps are maps Φ such that (id_k ⊗ Φ) is a positive map for any natural number k. That is, even if there is an ancillary system of arbitrary dimension, the map still preserves positivity. These maps were characterized in terms of their Choi matrix in the early ’70s [2], and it turns out that Φ is completely positive if and only if C_Φ is positive semidefinite. It follows from the spectral decomposition theorem (much like in Section 2) that Φ can be written as

$\Phi(X)=\sum_iA_iXA_i^*.$

Again, the A_i operators (which are known as Kraus operators) are obtained by reshaping the eigenvectors of C_Φ. It also follows (and was proved by Choi) that Φ is completely positive if and only if (id_n ⊗ Φ) is positive. Also note that, as there exists an orthonormal basis of eigenvectors of C_Φ, the A_i operators can be constructed so that they are mutually orthogonal in the Hilbert-Schmidt inner product; that is, Tr(A_i^*A_j) = 0 whenever i ≠ j. An alternative method of deriving the representation of Φ(X) is to use the Stinespring Dilation Theorem [1, Theorem 4.1] of operator theory.
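
As a rough illustration of the eigenvector-reshaping recipe, here is a NumPy sketch that tests complete positivity via the Choi matrix and extracts Kraus operators. The function names, the tolerance, and the dephasing-style example map are all assumptions made for the sake of the example; each eigenvector is scaled by the square root of its eigenvalue before being reshaped.

```python
import numpy as np

def choi_matrix(phi, n):
    """Choi matrix of phi in the standard basis: block (i, j) is phi(|e_i><e_j|)."""
    blocks = []
    for i in range(n):
        row = []
        for j in range(n):
            unit = np.zeros((n, n), dtype=complex)
            unit[i, j] = 1
            row.append(np.asarray(phi(unit)))
        blocks.append(row)
    return np.block(blocks)

def kraus_operators(C, n, m, tol=1e-10):
    """Kraus operators of a completely positive map with (Hermitian) Choi matrix C."""
    eigenvalues, eigenvectors = np.linalg.eigh(C)
    if eigenvalues.min() < -tol:
        raise ValueError("Choi matrix is not positive semidefinite")
    return [
        np.sqrt(lam) * eigenvectors[:, k].reshape(n, m).T
        for k, lam in enumerate(eigenvalues)
        if lam > tol
    ]

Z = np.diag([1.0, -1.0])

def dephasing(X):
    """Example completely positive map: a mixture of X and Z X Z."""
    return 0.75 * X + 0.25 * Z @ X @ Z

kraus = kraus_operators(choi_matrix(dephasing, 2), 2, 2)

rng = np.random.default_rng(0)
X = rng.normal(size=(2, 2)) + 1j * rng.normal(size=(2, 2))
rebuilt = sum(A @ X @ A.conj().T for A in kraus)
print(len(kraus), np.allclose(rebuilt, dephasing(X)))
```

Passing the Choi matrix of the transpose map to `kraus_operators` raises the `ValueError`, reflecting the fact that the transpose map is positive but not completely positive.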

5. k-Positive Maps / k-Block Positive Operators

Interpolating between the situations of Section 3 and Section 4 are k-positive maps. A map is said to be k-positive if (id_k ⊗ Φ) is a positive map. Thus, complete positivity of a map Φ is equivalent to Φ being k-positive for all natural numbers k, which is equivalent to Φ being n-positive. Positivity of Φ is the same as 1-positivity of Φ. Since we don’t even have effective methods for determining positivity of linear maps, it makes sense that we don’t have effective methods for determining k-positivity of linear maps, so they are still a fairly active area of research. It is known that Φ is k-positive if and only if

$\langle x|C_\Phi|x\rangle\geq 0\quad\forall\,|x\rangle\text{ with }SR(|x\rangle)\leq k.$

Operators of this type are referred to as k-block positive operators, and SR(x) denotes the Schmidt rank of the vector x. Because a vector has Schmidt rank 1 if and only if it is separable, it follows that this condition reduces to the condition that we saw in Section 3 for positive maps in the k = 1 case. Similarly, since all vectors have Schmidt rank less than or equal to n, it follows that Φ is n-positive if and only if C_Φ is positive semidefinite, which we saw in Section 4.

Update [October 23, 2009]: Part II of this post is now online.

References:

[1] V. I. Paulsen, Completely Bounded Maps and Operator Algebras, Cambridge Studies in Advanced Mathematics 78, Cambridge University Press, Cambridge, 2003.
[2] M.-D. Choi, Completely Positive Linear Maps on Complex Matrices, Lin. Alg. Appl., 285-290 (1975).

IMDb Movie Ratings Over the Years

It’s time for a random dose of statistics courtesy of The Internet Movie Database. Let’s consider all movies that have been released theatrically over the last 60 years and see whether there is a trend in their perceived quality over time. That is, do new movies generally receive higher or lower scores on IMDb than old movies?

Before looking at the numbers though, we need some rules to clarify what types of movies we are considering:

We only consider theatrically-released films — no straight-to-video movies or TV movies.
Short films that were released theatrically (such as Pixar’s Presto) are included.
We only consider movies that have received 1000 or more votes. This restriction is to prevent movies with only a handful of votes from skewing the results too much.
The theatrical release date of the movie must have been at least as recent as 1950.
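
The rules above amount to a simple filter followed by a per-year average. Here is a minimal Python sketch of that computation; the record layout (keys `year`, `rating`, `votes`, `theatrical`) and the toy rows are hypothetical and are not taken from the actual data set.

```python
from collections import defaultdict
from statistics import mean

def average_score_by_year(movies, min_votes=1000, first_year=1950):
    """Average rating per release year over the movies that pass the filters."""
    ratings_by_year = defaultdict(list)
    for movie in movies:
        if (
            movie["theatrical"]
            and movie["votes"] >= min_votes
            and movie["year"] >= first_year
        ):
            ratings_by_year[movie["year"]].append(movie["rating"])
    return {year: mean(ratings) for year, ratings in sorted(ratings_by_year.items())}

# Hypothetical toy records, purely for illustration.
toy_movies = [
    {"title": "Movie A", "year": 1950, "rating": 7.5, "votes": 5000, "theatrical": True},
    {"title": "Movie B", "year": 1950, "rating": 8.5, "votes": 1200, "theatrical": True},
    {"title": "Movie C", "year": 1950, "rating": 2.0, "votes": 40, "theatrical": True},     # too few votes
    {"title": "Movie D", "year": 1989, "rating": 6.0, "votes": 9000, "theatrical": True},
    {"title": "Movie E", "year": 1989, "rating": 9.0, "votes": 3000, "theatrical": False},  # TV movie
    {"title": "Movie F", "year": 1949, "rating": 8.0, "votes": 7000, "theatrical": True},   # too old
]

print(average_score_by_year(toy_movies))
```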

IMDb contains 10034 movies that satisfy the above criteria. The average score (on a scale of 1 to 10) of those movies is 6.38 and the median score is 6.6. The average score per release year is given by the following graph:

IMDb Ratings

As you can see, older movies (1950 – 1975) have abnormally high scores, as do very recent movies (2000 – 2009). These differences are indeed statistically significant. For example, the p-value associated with the test that the mean score in 1950 is the same as the mean score in 1989 is less than 10^-19. The p-value associated with the test that the mean score in 2008 is the same as the mean score in 1989 is about 0.0021. Other nearby years give similar p-values.
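
The post does not say which test produced these p-values, so the following Python sketch should be read as one plausible approach rather than the method actually used: a two-sample comparison of means with unequal variances, using a normal approximation for the p-value (reasonable when each year contains many movies). The function name and the toy scores are hypothetical.

```python
from math import erfc, sqrt
from statistics import mean, variance

def two_sample_p_value(sample_a, sample_b):
    """Two-sided p-value for equal means, using a normal approximation."""
    standard_error = sqrt(
        variance(sample_a) / len(sample_a) + variance(sample_b) / len(sample_b)
    )
    z = (mean(sample_a) - mean(sample_b)) / standard_error
    # erfc keeps its accuracy far out in the tail, where p-values are tiny.
    return erfc(abs(z) / sqrt(2))

# Hypothetical toy scores, purely for illustration.
scores_old = [7.9, 8.1, 7.4, 7.7, 8.3, 7.2]
scores_recent = [6.1, 5.8, 6.9, 6.4, 5.5, 6.6]

print(two_sample_p_value(scores_old, scores_recent))
```

With samples this small a t-distribution would be the more careful choice; the normal approximation is used here only to keep the sketch within the standard library.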

So this tells us that, in general, particularly old movies receive the highest scores, followed by newly-released movies, followed by “semi-old” movies from the 1980s and 1990s. So why the differences? Were movies from the 1980s really just that bad? Possibly, but the more likely explanation is that movies from the 1950s through 1970s have artificially higher scores because people don’t generally go back and watch the crummy movies of the last generation, so they get forgotten and do not have 1000 votes on IMDb. Will people be watching Disaster Movie in forty years? I sure hope not.

On the other hand, particularly recent movies tend to draw a fair amount of hype and fanboyism. Remember when The Dark Knight had a score of 9.8 and was at #1 on the IMDb top 250? Now, one year later, it has a score of 8.9 and is located at #9 on the top 250. It will likely dwindle a little further down over the coming years as well.

The Best and Worst of Each Year

While we’re looking at ratings of movies over the years, I suppose I might as well provide a list of the best and worst movie of each year (based on the votes of IMDb users), since such a list is not available on the IMDb website itself to my knowledge. Keep in mind that, as before, only movies with 1000 or more votes are considered. Enjoy!

Year	Best	Worst
1950	Sunset Blvd.	Destination Moon
1951	Strangers on a Train	Flying Padre: An RKO-Pathe Screenliner
1952	Singin’ in the Rain	Jack and the Beanstalk
1953	Duck Amuck	Robot Monster
1954	Rear Window	Jail Bait
1955	Nuit et brouillard	Bride of the Monster
1956	The Killing	The Conqueror
1957	12 Angry Men	Beginning of the End
1958	Vertigo	The Screaming Skull
1959	North by Northwest	Yusei oji
1960	Psycho	Ein Toter hing im Netz
1961	Divorzio all’italiana	The Beast of Yucca Flats
1962	Lawrence of Arabia	Eegah
1963	The Great Escape	The Skydivers
1964	Dr. Strangelove or: How I Learned to Stop Worrying and Love the Bomb	The Starfighters
1965	Per qualche dollaro in più	Monster a-Go Go
1966	Il buono, il brutto, il cattivo.	Night Train to Mundo Fine
1967	Cool Hand Luke	The Hellcats
1968	C’era una volta il West	Girl in Gold Boots
1969	Le chagrin et la pitié	Five the Hard Way
1970	Mihai Viteazul	Hercules in New York
1971	12 stulyev	The Touch of Satan
1972	The Godfather	Night of the Lepus
1973	The Sting	Gojira tai Megaro
1974	The Godfather: Part II	The Bat People
1975	Hababam sinifi	Zaat
1976	Tosun Pasa	Track of the Moon Beast
1977	Saban Oglu Saban	The Incredible Melting Man
1978	Kibar Feyzo	Laserblast
1979	Apocalypse Now	Angels’ Brigade
1980	Star Wars: Episode V – The Empire Strikes Back	L’uomo puma
1981	Raiders of the Lost Ark	Le lac des morts vivants
1982	Vincent	Megaforce
1983	Jaane Bhi Do Yaaro	Los nuevos extraterrestres
1984	Balkanski spijun	Ator l’invincibile 2
1985	Esperando la carroza	Final Justice
1986	Aliens	Zombie Nightmare
1987	L’homme qui plantait des arbres	Leonard Part 6
1988	Nuovo cinema Paradiso	Hobgoblins
1989	Ilha das Flores	R.O.T.O.R.
1990	Goodfellas	The Final Sacrifice
1991	The Silence of the Lambs	Cool as Ice
1992	Reservoir Dogs	Meatballs 4
1993	Schindler’s List	Barschel – Mord in Genf?
1994	The Shawshank Redemption	Tangents
1995	The Usual Suspects	Dis – en historie om kjærlighet
1996	Paradise Lost: The Child Murders at Robin Hood Hills	Merlin’s Shop of Mystical Wonders
1997	Masumiyet	Pocket Ninjas
1998	American History X	Die Hard Dracula
1999	Fight Club	The Underground Comedy Movie
2000	Memento	The Tony Blair Witch Project
2001	The Lord of the Rings: The Fellowship of the Ring	Glitter
2002	Cidade de Deus	Ben & Arthur
2003	The Lord of the Rings: The Return of the King	From Justin to Kelly
2004	Eternal Sunshine of the Spotless Mind	Superbabies: Baby Geniuses 2
2005	Babam Ve Oglum	Troppo belli
2006	Kiwi!	Pledge This!
2007	Heima	Ram Gopal Varma Ki Aag
2008	The Dark Knight	Disaster Movie
2009 (so far)	Inglourious Basterds	Jonas Brothers: The 3D Concert Experience

Downloads:

IMDb rating data [.zip of an Excel spreadsheet — 341KB]
IMDb rating data [tab-delimited plaintext — 0.97MB]

Nathaniel Johnston

Quantum information theory, cellular automata, and recreational mathematics