-------------------------------------
How are matrices and tensors related?
-------------------------------------

Mathematicians and physicists differ in the notation used for vectors, tensors, matrices, and multilinear forms. Here is a dictionary.

T^q   = tensor product of q copies of the vector space T;
        in particular, T^0 = S is the algebra of scalar fields, and T^1 = T.
T^p_q = space of all linear mappings from T^q to T^p;
        its elements are the (p,q)-tensors, with p upper and q lower indices.
T^q_0 = T^q.
T_p   := T^0_p = (T^p)^* is the so-called dual space of T^p;
        in particular, T_1 = T^* is the dual space of T;
        its elements are the linear forms = covectors.

One can associate with every A in T^p_q canonically a multilinear mapping B: T^q tensor T_p --> S with
   B(s,t) = t(As)   for s in T^q, t in T_p,
and conversely; indeed, since the image As of s under A is in T^p, its image t(As) is a well-defined scalar. Using the B's in place of the A's gives an alternative way of defining tensors, although one less convenient for visualization.

Given a basis of T and the dual cobasis of T^*, one can use coordinates. Then physicists write
- elements of T as vectors = column vectors, with an upper index,
- elements of T^* as linear forms = 1-forms = covectors = row vectors, with a lower index,
- elements of T^q as multivectors, with q upper indices,
- elements of T_p as multicovectors, with p lower indices,
- elements of T^q_p as mixed multi/co/vectors, with p lower and q upper indices.

(There is also a dual version of this, where vectors are considered as rows and covectors as columns. The remainder then changes accordingly.)

In particular:
(0,0)-tensor = scalar,
(1,0)-tensor = vector (vector in T = T^1) = column vector,
(0,1)-tensor = covector (vector in the dual space T^* = T_1) = row vector,
(1,1)-tensor = matrix (linear mapping from T to T).

Clearly, the columns of a matrix A^i_k are column vectors = vectors, the rows are row vectors = covectors, and the indexing is consistent.
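This dictionary can be checked numerically; here is a minimal sketch in NumPy (the dimension 3 and all numerical values are arbitrary illustrations, not taken from the text):

```python
import numpy as np

x = np.array([[1.0], [2.0], [3.0]])   # vector in T = column vector, one upper index
w = np.array([[4.0, 5.0, 6.0]])       # covector in T^* = row vector, one lower index
A = np.arange(9.0).reshape(3, 3)      # (1,1)-tensor = matrix, a linear map T -> T

s = (w @ x).item()   # covector applied to a vector gives a scalar: w(x) = w_i x^i = 32
y = A @ x            # matrix applied to a column vector gives a column vector
col = A[:, [0]]      # a column of A is a column vector = vector
row = A[[0], :]      # a row of A is a row vector = covector
```

Note that the shapes alone encode the distinction: (n,1) for vectors, (1,n) for covectors, (n,n) for matrices.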
The requirement that basis and cobasis are dual is equivalent to the statement that, for every vector u and covector w (i.e., linear mapping from vectors to scalars),
   w(u) = w_i u^i;
here the Einstein summation convention is used: a formula involving a pair of equally labelled indices, one of them a lower index and the other an upper index, must be interpreted as a sum over these indices.

Mathematicians using linear algebra (where no tensors of order > 2 appear) instead write all indices as lower indices, no matter whether they belong to row vectors, column vectors, or matrices. They also write all sums explicitly, consider all vectors denoted by a single letter as column vectors, and write covectors (1-forms) explicitly using the transposition sign ^T (statisticians often use a prime ' instead, which is also the form used in Matlab). This has many advantages and allows a simple notation which increases the understandability of otherwise long formulas.

Phys. notation: s = x^k y_k                          (x vector, y covector)
Math. notation: s = sum_k y_k x_k, or simply s = y^T x.

Phys. notation: y^i = A^i_k x^k                      (x, y vectors; A matrix)
Math. notation: y_i = sum_k A_ik x_k, or simply y = Ax.

Phys. notation: s = A^i_i                            (A matrix)
Math. notation: s = sum_i A_ii, or simply s = tr A (trace).

Phys. notation: y^i = A^i_j B^j_k x^k                (x, y vectors; A, B matrices)
Math. notation: y_i = sum_jk A_ij B_jk x_k, or simply y = ABx.

Phys. notation: y^i = A^i_j B^j_k C^k_l D^l_m x^m    (x, y vectors; A, B, C, D matrices)
Math. notation: y_i = sum_jklm A_ij B_jk C_kl D_lm x_m, or simply y = ABCDx.

The linear algebra notation is compact and index-free, in spite of the fact that coordinates are being used. For higher order tensors, the advantages of the linear algebra notation are less pronounced, since one has to specify which pairs of indices must be contracted. However, an index-free notation is often still possible:

Phys. notation: A_li = R_ijkl b^j c^k
Math. notation: A(u,v) = R(v,b,c,u)

Phys. notation: A_l^i = R^i_j^k_l b^j c_k
Math.
notation: A(u,v^T) = R(v^T,b,c^T,u)

Phys. notation: A_i^j = R_i^k_k^j
Math. notation: A = tr_23 R,
where the subscripts indicate which indices must be contracted.

All this is completely independent of any metric. If a metric = nondegenerate symmetric (0,2)-tensor g is given on T, which associates with u, v in T the scalar g(u,v), one can canonically identify vectors and covectors, at the expense of some confusion if one is not careful. In physicists' notation this reads as follows: The metric is g_ik = g_ki (expressing the symmetry), and for every vector u^k, the associated covector is
   u_i = g_ik u^k.
Conversely, one can reconstruct the vector from the covector using
   u^k = g^ik u_i,
where g^ik = g^ki is the inverse metric, a symmetric (2,0)-tensor which for consistency must satisfy the equations
   g_ij g^kj = delta_i^k                       (*)
with the Kronecker delta (delta_i^k = 1 if i = k, = 0 otherwise), which is the identity matrix written as a (1,1)-tensor in index notation. Nondegeneracy is precisely the solvability of (*) for the dual metric.

Mathematicians find it confusing to label different objects with the same symbol, and prefer to always distinguish between a vector and its canonically associated covector. Given a basis of T and the dual cobasis of T^*, coordinates (row and column vectors) can be used to define the elements of T and T^*; the metric g in T_2 is represented in these coordinates by an invertible symmetric matrix = (1,1)-tensor G such that
   g(u,v) = u^T G v   for u, v in T.
The canonical pairing induced by the metric therefore associates with the vector u the covector
   w^T = u^T G.                                (**)
Conversely, one can reconstruct from the covector w^T the canonically associated vector
   u = G^{-1} w.
The dual metric therefore maps u^T, v^T to u^T G^{-1} v, and is represented by the inverse matrix G^{-1}. The relation between the physicists' form and the linear algebra form of writing things can be inferred from (**); we simply have

Phys. notation: g_ik
Math. notation: G = (g_ik)

Phys. notation: g^ik
Math. notation: G^{-1} = (g^ik)

Again, the linear algebra notation is compact and index-free, in spite of the fact that coordinates are being used.
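The correspondences between the two notations, including the metric formulas, can be verified numerically. The sketch below uses NumPy's einsum for the physicists' index sums and plain matrix algebra for the mathematicians' side; all matrices and values are arbitrary illustrative choices, not from the text:

```python
import numpy as np

rng = np.random.default_rng(0)
A, B = rng.random((3, 3)), rng.random((3, 3))
x = rng.random(3)

# y_i = sum_jk A_ij B_jk x_k versus y = ABx
y1 = np.einsum('ij,jk,k->i', A, B, x)
y2 = A @ B @ x
assert np.allclose(y1, y2)

# s = sum_i A_ii versus s = tr A
s1 = np.einsum('ii->', A)
assert np.isclose(s1, np.trace(A))

# contracting the 2nd and 3rd indices of a 4th order tensor: A = tr_23 R
R = rng.random((3, 3, 3, 3))
A2 = np.einsum('ikkj->ij', R)

# The metric: an invertible symmetric matrix G with g(u,v) = u^T G v.
G = np.array([[2.0, 1.0, 0.0],
              [1.0, 2.0, 1.0],
              [0.0, 1.0, 2.0]])
u = rng.random(3)
w = G @ u                          # lowering: u_i = g_ik u^k, i.e. w^T = u^T G
u_back = np.linalg.solve(G, w)     # raising with the inverse metric: u = G^{-1} w
assert np.allclose(u_back, u)      # the round trip recovers the vector
assert np.allclose(G @ np.linalg.inv(G), np.eye(3))   # g_ij g^jk = delta_i^k
```

The einsum strings are a direct transcription of the physicists' index formulas (with all indices written on one level, as NumPy does not distinguish upper from lower indices), while the @ expressions are the index-free linear algebra forms.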