diff --git a/interface.md b/interface.md deleted file mode 100644 index 870903c..0000000 --- a/interface.md +++ /dev/null @@ -1,52 +0,0 @@ -# Interface Issues - -- Integer size: - - Fixed at 32-bit, 64-bit or configurable? - - Same for lengths and strides? - - Signed or unsigned? -- Type for index string: - - char or integral (char allows string literals in C/C++). -- Pointers or pass-by-value for scalars? -- Return values (norm, dot, etc.) or parameters? -- Does reduce (idamax, etc.) return the index, the value, or both? -- Dedicated reductions (max, min, norm2, etc.) or a single function with an operation tag? -- For a mixed-precision interface: - - Separate functions? - - Type tags? - - What is the type of alpha and beta? - - Internal computation type? - - Accumulation type? -- Storage format for complex: - - Separate real/complex storage is nice for some applications. -- High-level vs. low-level interfaces: - - Low-level interface *must* be in ANSI C. - - Structs are probably a no-no - - Language compatibility (esp. with Fortran?) - - Doesn't necessarily need to be concise and intuitive - - 1st concern - - High-level interfaces - - Can be tuned to language and/or usage - - Use existing interfaces as well (Eigen, MATLAB, NumPy, etc.) - - 2nd concern -- Error checking: - - When does it happen? - - How much? - - Can it be turned off? - - What happens on error? -- Hardware: - - CPU only? - - Separate interface for GPU or combined? Heterogeneous? - - Out-of-core - - Threading - - Distributed parallelism (maybe)? - -## "Plans" - -Paul Springer incorporates the notion of a "plan" in [HPTT](https://github.com/springer13/hptt), similar to how it is used in e.g. FFTW. The basic idea is that instead of just calling an interface function, one creates an object for the operation. This object can then be executed (possibly more than once) and manipulated to change the operation parameters.
- -Examples of why this is helpful: - - Reducing overhead by calling the object multiple times with different pointers - - This also can increase the amount of allowable overhead, opening up new analysis/optimization opportunities - - Passing around operations for task scheduling, etc. (Lambdas work for this too). - - Creating an isolated execution environment - - E.g. specifying threading, precision, algorithm, etc. without being affected by global changes diff --git a/operations.md b/operations.md deleted file mode 100644 index b6f5baf..0000000 --- a/operations.md +++ /dev/null @@ -1,53 +0,0 @@ -# Classifying and Specifying Tensor Operations - -- *n*-ary operations - - Do we count all tensors (operands, maybe including lone scalars?) on the RHS? Just tensors being "multiplied"? All but the LHS? - - How are `norm`, `reduce`, and `dot` classified? - - Is it enough to consider just "binary" operations? -- Existing methods for specifying operations (contraction at least): - 1. Reduction to canonical form. Example: specify input permutation of A and B and always contract last *k* indices of A and first *k* indices of B, then permute output into C. - - ```C[abcd] = A[afce]*B[edbf] -> perm_A = [0,2,3,1], perm_B = [0,3,2,1], perm_C = [0,2,1,3]``` - - (Note that the permutations are not unique) - - 2. Pick dimensions by position. Example: specify which dimensions are contracted listing their position in A and B. Dimensions of C are all those left over (some interfaces have no way to choose the ordering in C!). E.g. Tensor Toolbox and Eigen. - - ```C[abcd] = A[afce]*B[edbf] -> ctr_A = [3,1], ctr_B = [0,3], + explicit permutation of C``` - - 3. Pick dimensions using Einstein summation. Example: specify index strings for each operand. Repeated indices in A and B are contracted. Relative positions determine necessary permutations. I think this is everybody's current favorite. - - ```C[abcd] = A[afce]*B[edbf] -> idx_A = "afce", idx_B = "edbf", idx_C = "abcd"``` - - 4. 
Specify the shape of the operation, not the shape of the operands. Example: generalize the GEMM interface by specifying multiple *m*, *n*, and *k* dimensions and shapes, with two stride vectors for each operand. This is actually highly useful for the implementation! - - ```C++ - contract(ndim_M, shape_M, ndim_N, shape_N, ndim_K, shape_K, - alpha, A, stride_A_M, stride_A_K, - B, stride_B_K, stride_B_N, - beta, C, stride_C_M, stride_C_N); - ``` -- Permutations: "comes-from" or "goes-to" notation? -- Beyond contraction: - - Generalized Einstein summation: sum over all indices that don't appear on the LHS. Allows: batched operations (weighting), single-index trace, extended contraction (CP decomposition etc.). - - What are these operations called? Possibility: - - Appears in only A or B but not both: trace. - - Appears in A and B but not C: contract. - - Appears in A or B (not both) and in C: free. - - Appears in only C: replicate (or broadcast). - - Appears in all 3: weight. - - More than 3 operands: who knows? -- "Batched" operations: - - When batch operands are regularly strided in memory, the operation can be encoded as a higher-dimensional non-batched operation. True for batched matrix operations as well, but it doesn't fit in the matrix-based interface. - - Otherwise (batch operands are randomly located), is it better to have a batched interface or more complicated tensor layouts? E.g. [TBLIS](https://github.com/devinamatthews/tblis) can do operations over tensors and matrices without regular strides (I call it a scatter layout). - - Why do we want batched operations? - - Block sparsity! - - Others? -- Threading: should this be explicit in the interface? -- Mixed/extended/reduced precision and mixed-domain. Integers too. - - Internal computation in single precision may significantly speed many things up without changing the calling program. - - Good complex/mixed-domain support is very important in many calculations. -- Notation: - - free vs. uncontracted vs. external and bound vs.
contracted vs. internal - permutation vs. transpose vs. shuffle vs. etc. - contraction vs. multiplication (should multiplication be something more general?) diff --git a/proposal1.md b/proposal1.md deleted file mode 100644 index 75d9696..0000000 --- a/proposal1.md +++ /dev/null @@ -1,105 +0,0 @@ -This is an initial proposal for a mixed-precision tensor contraction interface: - -```C++ -typedef char mode_type; -typedef int64_t stride_type; -typedef int64_t extent_type; -``` - -```C++ -enum error_t -{ - SUCCESS, - INVALID_ARGUMENTS, - INTERNAL_ERROR, - NOT_SUPPORTED -}; -``` - -```C++ -enum data_type_t -{ - TYPE_FP16, - TYPE_FP32, - TYPE_FP64, - TYPE_INT16, - TYPE_INT32, - TYPE_INT64, - TYPE_FCOMPLEX, - TYPE_DCOMPLEX -}; -``` - -```C++ -/** - * \brief This routine computes the tensor contraction C = alpha * op(A) * op(B) + beta * op(C) - * - * \f[ \mathcal{C}_{\text{modes}_\mathcal{C}} \gets \alpha * op(\mathcal{A}_{\text{modes}_\mathcal{A}}) op(\mathcal{B}_{\text{modes}_\mathcal{B}}) + \beta op(\mathcal{C}_{\text{modes}_\mathcal{C}}), \f] - * where op(X) = X or op(X) = complex conjugate(X). - * - * - * \param[in] alpha Scaling for A*B (data_type_t is determined by 'typeCompute') - * \param[in] A Pointer to the data corresponding to A (data type is determined by 'typeA') - * \param[in] typeA Datatype of A. These values could be, e.g., TYPE_FP32, TYPE_FP64, TYPE_FCOMPLEX, or TYPE_DCOMPLEX - * \param[in] conjA Indicates if the entries of A should be conjugated (only applies to complex types) - * \param[in] nmodeA Number of modes of A - * \param[in] extentA Array with 'nmodeA' values that represents the extent of A (e.g., extentA[] = {4,8,12} represents an order-3 tensor of size 4x8x12). - * \param[in] strideA Array with 'nmodeA' values that represents the strides of - * A with respect to each mode.
The following inequality must be obeyed: - * (strideA[i] == 0) or (strideA[i] >= s * extentA[i-1], if i > 0, where s - * represents the last strideA[j] that is larger than 0, with j < i). - * strideA[i] == 0 indicates that this dimension will be broadcast. - * - * This argument is optional and may be NULL; in this case a compact - * tensor is assumed. - * \param[in] modeA Array with 'nmodeA' values that represent the modes of A. - * \param[in] B Pointer to the data corresponding to B (data type is determined by 'typeB') - * \param[in] typeB Datatype of B (see typeA) - * \param[in] conjB Indicates if the entries of B should be conjugated (only applies to complex types) - * \param[in] nmodeB Number of modes of B - * \param[in] extentB Array with 'nmodeB' values that represents the extent of B. - * \param[in] strideB Array with 'nmodeB' values that represents the strides of B with respect to each mode (see strideA). - * \param[in] beta Scaling for C (data_type_t is determined by 'typeCompute') - * \param[in,out] C Pointer to the data corresponding to C (data type is determined by 'typeC') - * \param[in] typeC Datatype of C (see typeA) - * \param[in] conjC Indicates if the initial entries of C should be conjugated (only applies to complex types) - * \param[in] nmodeC Number of modes of C - * \param[in] extentC Array with 'nmodeC' values that represents the extent of C. - * \param[in] strideC Array with 'nmodeC' values that represents the strides of C with respect to each mode (see strideA).
- * \param[in] typeCompute Datatype for the intermediate computation T = A * B - * - * - * Example: - * - * The tensor contraction C[a,b,c,d] = 1.3 * A[b,e,d,f] * B[f,e,a,c], - * where C, A, and B respectively are double-precision tensors of size E_a x E_b x E_c x E_d, - * E_b x E_e x E_d x E_f, and E_f x E_e x E_a x E_c, can be computed as follows: - * - * double alpha = 1.3; - * double beta = 0.0; - * extent_type extentC[] = {E_a, E_b, E_c, E_d}; - * extent_type extentA[] = {E_b, E_e, E_d, E_f}; - * extent_type extentB[] = {E_f, E_e, E_a, E_c}; - * stride_type strideC[] = {1, E_a, E_a*E_b, E_a*E_b*E_c}; //optional - * stride_type strideA[] = {1, E_b, E_b*E_e, E_b*E_e*E_d}; //optional - * stride_type strideB[] = {1, E_f, E_f*E_e, E_f*E_e*E_a}; //optional - * mode_type modeC[] = {'a','b','c','d'}; - * mode_type modeA[] = {'b','e','d','f'}; - * mode_type modeB[] = {'f','e','a','c'}; - * int nmodeA = 4; - * int nmodeB = 4; - * int nmodeC = 4; - * data_type_t typeA = TYPE_FP64; - * data_type_t typeB = TYPE_FP64; - * data_type_t typeC = TYPE_FP64; - * data_type_t typeCompute = TYPE_FP64; - * - * error_t error = tensorMult(&alpha, A, typeA, false, nmodeA, extentA, NULL, modeA, - * B, typeB, false, nmodeB, extentB, NULL, modeB, - * &beta, C, typeC, false, nmodeC, extentC, NULL, modeC, typeCompute); - * - */ -error_t tensorMult(const void* alpha, const void *A, data_type_t typeA, bool conjA, int nmodeA, const extent_type *extentA, const stride_type *strideA, const mode_type* modeA, - const void *B, data_type_t typeB, bool conjB, int nmodeB, const extent_type *extentB, const stride_type *strideB, const mode_type* modeB, - const void* beta, void *C, data_type_t typeC, bool conjC, int nmodeC, const extent_type *extentC, const stride_type *strideC, const mode_type* modeC, data_type_t typeCompute); -``` diff --git a/proposal2a.md b/proposal2a.md deleted file mode 100644 index 7469672..0000000 --- a/proposal2a.md +++ /dev/null @@ -1,96 +0,0 @@ -`XXX` is an
appropriate namespace TBD. - -```C -typedef int64_t XXX_extent; -typedef int64_t XXX_stride; -``` -These types should almost certainly be signed. 64-bit seems like a fair assumption these days. - -```C -typedef int32_t XXX_index; -``` -This can probably be just about any integral type. - -```C -typedef enum -{ - XXX_TYPE_F32, - XXX_TYPE_F64, - XXX_TYPE_C32, - XXX_TYPE_C64, - ... -} XXX_datatype; - -typedef enum -{ - XXX_TYPE_F32_F32_ACCUM_F32 = XXX_TYPE_F32, - ... -} XXX_comp_datatype; -``` -Enumerations for the supported storage and computational datatypes. Not all combinations are required to be supported. - -```C -typedef /* unspecified */ XXX_error; // Should be a trivial type, e.g. "int" - -int XXX_error_check(XXX_error err); // return non-zero on error - -const char* XXX_error_explain(XXX_error err); - -void XXX_error_clear(XXX_error err); -``` -Error handling --- implementation defined. - -```C -typedef /* unspecified */ XXX_attr; // Requires initialization. E.g. "struct XXX_attr_internal*" -typedef int32_t XXX_key; // Some values should be reserved for standardization - -XXX_error XXX_attr_init(XXX_attr* attr); - -XXX_error XXX_attr_destroy(XXX_attr* attr); - -XXX_error XXX_attr_set(XXX_attr* attr, XXX_key, void* value); - -XXX_error XXX_attr_get(XXX_attr* attr, XXX_key, void** value); - -XXX_error XXX_attr_clear(XXX_attr* attr, XXX_key); -``` -Implementation-defined (and maybe some standard) attributes, loosely based on MPI. - -```C -// Unary and binary element-wise operations (transpose, scale, norm, reduction, etc.) should also be defined!
- -// Compute D_{idx_D} = alpha * A_{idx_A} * B_{idx_B} + beta * C_{idx_C} - -XXX_error -XXX_contract(const void* alpha, - XXX_datatype type_alpha, - const void* A, - XXX_datatype type_A, - int nmode_A, - const XXX_extent* shape_A, - const XXX_stride* stride_A, - const XXX_index* idx_A, - const void* B, - XXX_datatype type_B, - int nmode_B, - const XXX_extent* shape_B, - const XXX_stride* stride_B, - const XXX_index* idx_B, - const void* beta, - XXX_datatype type_beta, - const void* C, - XXX_datatype type_C, - int nmode_C, - const XXX_extent* shape_C, - const XXX_stride* stride_C, - const XXX_index* idx_C, - void* D, - XXX_datatype type_D, - int nmode_D, - const XXX_extent* shape_D, - const XXX_stride* stride_D, - const XXX_index* idx_D, - XXX_comp_datatype comp_type, - XXX_attr attr); -``` - diff --git a/proposal2b.md b/proposal2b.md deleted file mode 100644 index da7a501..0000000 --- a/proposal2b.md +++ /dev/null @@ -1,46 +0,0 @@ -See [Proposal 2a](proposal2a.md) for definitions of basic types. This "very-low-level" interface and Proposal 2a could coexist.
- -```C -// Compute D_{MNL} = alpha * \sum_K A_{MKL} B_{KNL} + beta * C_{MNL} - -XXX_error -XXX_contract( int nmode_M, - const XXX_extent* shape_M, - int nmode_N, - const XXX_extent* shape_N, - int nmode_K, - const XXX_extent* shape_K, - int nmode_L, - const XXX_extent* shape_L, - const void* alpha, - XXX_datatype type_alpha, - const void* A, - XXX_datatype type_A, - const XXX_stride* stride_A_M, - const XXX_stride* stride_A_K, - const XXX_stride* stride_A_L, - const void* B, - XXX_datatype type_B, - const XXX_stride* stride_B_K, - const XXX_stride* stride_B_N, - const XXX_stride* stride_B_L, - const void* beta, - XXX_datatype type_beta, - const void* C, - XXX_datatype type_C, - const XXX_stride* stride_C_M, - const XXX_stride* stride_C_N, - const XXX_stride* stride_C_L, - void* D, - XXX_datatype type_D, - const XXX_stride* stride_D_M, - const XXX_stride* stride_D_N, - const XXX_stride* stride_D_L, - XXX_comp_datatype type_comp, - XXX_attr attr); - -// Batched tensor contraction (TBD) - -XXX_error -XXX_contract_batch( ???? 
); -``` diff --git a/src/tapp.h b/src/tapp.h index d568e14..5f4ae22 100644 --- a/src/tapp.h +++ b/src/tapp.h @@ -1,13 +1,13 @@ #ifndef TAPP_TAPP_H_ #define TAPP_TAPP_H_ -#include "tapp/error.h" -#include "tapp/attributes.h" -#include "tapp/datatype.h" -#include "tapp/handle.h" -#include "tapp/executor.h" -#include "tapp/status.h" -#include "tapp/tensor.h" -#include "tapp/product.h" +#include "tapp/tapp_error.h" +#include "tapp/tapp_attributes.h" +#include "tapp/tapp_datatype.h" +#include "tapp/tapp_handle.h" +#include "tapp/tapp_executor.h" +#include "tapp/tapp_status.h" +#include "tapp/tapp_tensor.h" +#include "tapp/tapp_product.h" #endif /* TAPP_TAPP_H_ */ diff --git a/src/tapp/attributes.h b/src/tapp/tapp_attributes.h similarity index 94% rename from src/tapp/attributes.h rename to src/tapp/tapp_attributes.h index 679bd8b..4b968a4 100644 --- a/src/tapp/attributes.h +++ b/src/tapp/tapp_attributes.h @@ -3,7 +3,7 @@ #include -#include "error.h" +#include "tapp_error.h" typedef intptr_t TAPP_attr; typedef int TAPP_key; diff --git a/src/tapp/datatype.h b/src/tapp/tapp_datatype.h similarity index 100% rename from src/tapp/datatype.h rename to src/tapp/tapp_datatype.h diff --git a/src/tapp/error.h b/src/tapp/tapp_error.h similarity index 100% rename from src/tapp/error.h rename to src/tapp/tapp_error.h diff --git a/src/tapp/executor.h b/src/tapp/tapp_executor.h similarity index 94% rename from src/tapp/executor.h rename to src/tapp/tapp_executor.h index debe17c..62bcf1a 100644 --- a/src/tapp/executor.h +++ b/src/tapp/tapp_executor.h @@ -3,7 +3,7 @@ #include -#include "error.h" +#include "tapp_error.h" typedef intptr_t TAPP_executor; diff --git a/src/tapp/handle.h b/src/tapp/tapp_handle.h similarity index 95% rename from src/tapp/handle.h rename to src/tapp/tapp_handle.h index 65d67d5..eb32db7 100644 --- a/src/tapp/handle.h +++ b/src/tapp/tapp_handle.h @@ -3,7 +3,7 @@ #include -#include "error.h" +#include "tapp_error.h" typedef intptr_t TAPP_handle; diff --git 
a/src/tapp/product.h b/src/tapp/tapp_product.h similarity index 92% rename from src/tapp/product.h rename to src/tapp/tapp_product.h index 41caaa3..6302216 100644 --- a/src/tapp/product.h +++ b/src/tapp/tapp_product.h @@ -3,12 +3,12 @@ #include -#include "error.h" -#include "handle.h" -#include "executor.h" -#include "datatype.h" -#include "status.h" -#include "tensor.h" +#include "tapp_error.h" +#include "tapp_handle.h" +#include "tapp_executor.h" +#include "tapp_datatype.h" +#include "tapp_status.h" +#include "tapp_tensor.h" //TODO: where should this go? typedef int TAPP_element_op; @@ -26,10 +26,10 @@ enum * TODO: what are the required error conditions? * * TODO: must C and D info be the same? (should they just be the same variable?) - * JB: Can this be implemented efficiently with different data types of C and D? + * JB: Can this be implemented efficiently with different data types of C and D? * Let’s say D is complex and C real. Then it should be possible with a different "stride". - * In such cases we might want to support different C and D info. If D info is null, they - * are assumed identical. + * In such cases we might want to support different C and D info. If D info is null, they + * are assumed identical. 
*/ typedef intptr_t TAPP_tensor_product; diff --git a/src/tapp/status.h b/src/tapp/tapp_status.h similarity index 94% rename from src/tapp/status.h rename to src/tapp/tapp_status.h index 7cfd690..5be2776 100644 --- a/src/tapp/status.h +++ b/src/tapp/tapp_status.h @@ -3,7 +3,7 @@ #include -#include "error.h" +#include "tapp_error.h" typedef intptr_t TAPP_status; diff --git a/src/tapp/tensor.h b/src/tapp/tapp_tensor.h similarity index 96% rename from src/tapp/tensor.h rename to src/tapp/tapp_tensor.h index ef405e3..79b0c74 100644 --- a/src/tapp/tensor.h +++ b/src/tapp/tapp_tensor.h @@ -3,8 +3,8 @@ #include -#include "error.h" -#include "datatype.h" +#include "tapp_error.h" +#include "tapp_datatype.h" typedef intptr_t TAPP_tensor_info; diff --git a/terminology.md b/terminology.md deleted file mode 100644 index a72553f..0000000 --- a/terminology.md +++ /dev/null @@ -1,81 +0,0 @@ -# Terminology - -## Tensor - -A **dense tensor** is a multi-dimensional array of arithmetic values. A tensor has a positive number, *n*, of **modes** (i.e. it is an *n*-mode tensor).[1](#foot1) Each **mode** has a non-negative **extent**[2](#foot2), which is the number of distinct values that the mode can have. As an example, consider a 4-mode tensor *A* with real elements and extents of 4, 5, 9, and 3: - -![](https://latex.codecogs.com/gif.latex?\mathcal{A}\in{}\mathbb{R}^{4\times{}5\times{}9\times{}3}) - -In a programming language such as C, this would correspond to: - -```C -double A[4][5][9][3]; -``` - -(or `float`, etc.). - -1) Also: *n*-dimensional, *n*-way, order-*n*, *n*-ary, rank-*n*, *n*-adic, *n*-fold, *n*-index, *n* subspaces. - -2) Also: length, dimension, size. - -## Indexing - -Individual tensor elements are referred to by **indexing**. A *d*-mode tensor is indexed by *d* **indices**.[3](#foot3) Each index may take on a definite integral value in the range `[0,n)`,[4](#foot4) where *n* is the extent of the mode being indexed.
If an index appears multiple times, it takes on the same value in each case. For example, we may refer to elements of the tensor *A* above using indices *i*, *j*, *k*, and *l*: - -![](https://latex.codecogs.com/gif.latex?\mathcal{A}_{ijkl}\in{}\mathbb{R}) - -In this case it must be that `0 <= i < 4`, `0 <= j < 5`, `0 <= k < 9`, and `0 <= l < 3`. - -3) Also: labels, symbols. - -4) 0-based (C-style) indexing is used here, but 1-based indexing (Fortran- or Matlab-style) is also possible. The distinction between the two is only relevant when referencing a single element or sub-range of elements. Operations such as tensor contraction do not generally need to explicitly depend on any indexing method. - -## Shape - -A tensor **shape**[5](#foot5) is an ordered set of non-negative integers. The *i*th entry, `shape[i]`, gives the extent of the *i*th mode of the tensor. The shape of a particular tensor *A* is denoted `shape_A`. Mode *i* in tensor *A* is **compatible** with mode *j* in tensor *B* if `shape_A[i] == shape_B[j]`. Only compatible dimensions may share the same index. - -5) Also: size, structure. - -## Layout - -When the values of a tensor are placed in a linear storage medium (e.g. main memory), additional information is necessary to map values referred to by indices to linear locations. A tensor **layout** for a *d*-mode tensor is such a map from *d* indices to a linear location: - -![](https://latex.codecogs.com/gif.latex?\mathrm{layout}\\,\colon\mathbb{N}^d\to\mathbb{N}) - -The most useful layout for dense tensors is the **general stride** layout: - -![](https://latex.codecogs.com/gif.latex?\mathrm{layout}\\,\colon(i_0,\ldots,i_{d-1})\mapsto\sum_{k=0}^{d-1}i_k\cdot{s_k}) - -The ordered set of *s* values is the **stride**[6](#foot6) of the tensor, denoted for some tensor *A* by `stride_A`. In general a stride value may be negative, but the strides must obey the condition that no two elements with distinct indices share the same linear location.
There are two special cases of the general stride layout of importance: the **column-major** layout and the **row-major** layout. These are defined by: - -![](https://latex.codecogs.com/gif.latex?\begin{align*}\mathrm{layout_{col}}\\,\colon(i_0,\ldots,i_{d-1})\mapsto{}&\sum_{k=0}^{d-1}i_k\prod_{l=0}^{k-1}n_l\\\\{}\mathrm{layout_{row}}\\,\colon(i_0,\ldots,i_{d-1})\mapsto{}&\sum_{k=0}^{d-1}i_k\prod_{l=k+1}^{d-1}n_l\end{align*}) - -where *n* are the extents of the tensor. Since these layouts depend only on the tensor shape, an additional stride vector is not required. - -6) Also: shape, leading dimension (similar but not identical idea). - -## Tensor Specification - -A tensor *A* is fully specified by: - -1. Its number of modes `nmode_A`. -2. Its shape `shape_A`. -3. Its layout (restricting ourselves to general stride layouts) `stride_A`, or an assumption of column- or row-major storage. -4. A pointer to the origin element `A`. This is the location of ![](https://latex.codecogs.com/gif.latex?\mathcal{A}_{0\ldots{}0}). -5. Its data type (real, complex, precision, etc.), unless assumed from the context. - -## Other Terms - -The **size**[7](#foot7) of a tensor is the number of elements, given by the product of the extents. The **span**[8](#foot8) of a tensor is the difference between the lowest and highest linear location over all elements, plus one. In a general stride layout this is given by: - -![](https://latex.codecogs.com/gif.latex?\mathrm{span}=1+\sum_{i=0}^{d-1}(n_i-1)\cdot|s_i|) - -A tensor is **contiguous** if its span is equal to its size. The *i*th and *j*th modes are **sequentially contiguous** if `stride[j] == stride[i]*shape[i]` (note this requires `stride[j] >= stride[i]` and means that sequential contiguity is not commutative). An ordered set of modes is sequentially contiguous if each consecutive pair of modes is sequentially contiguous. 
A sequentially contiguous set of modes may be replaced (**folded**[9](#foot9)) by a single mode whose extent is equal to the product of the extents of the original modes, and whose stride is equal to the smallest stride of the original modes. The tensor formed from folding contains the same elements at the same linear locations as the original. A contiguous tensor may always be folded to a vector (or in general to a tensor with any smaller number of modes). - -7) Also: length. - -8) Also: extent. - -9) Also: linearized, unfolded (confusingly).