THE COLLECTED WORKS OF J. WILLARD GIBBS, Ph.D., LL.D.
FORMERLY PROFESSOR OF MATHEMATICAL PHYSICS IN YALE UNIVERSITY

IN TWO VOLUMES
VOLUME II
PART ONE: ELEMENTARY PRINCIPLES IN STATISTICAL MECHANICS
PART TWO: DYNAMICS, VECTOR ANALYSIS AND MULTIPLE ALGEBRA, ELECTROMAGNETIC THEORY OF LIGHT, ETC.

LONGMANS, GREEN AND CO.
NEW YORK • LONDON • TORONTO
1928

LONGMANS, GREEN AND CO.: 55 Fifth Avenue, New York; 221 East 20th Street, Chicago; Tremont Temple, Boston; 210 Victoria Street, Toronto
LONGMANS, GREEN AND CO. Ltd.: 39 Paternoster Row, E C 4, London; 53 Nicol Road, Bombay; 6 Old Court House Street, Calcutta; 167 Mount Road, Madras

COPYRIGHT, 1902, BY YALE UNIVERSITY
Permission for the reprint of the different papers contained in these volumes has in every case been obtained from the proper authorities.
PRINTED IN THE UNITED STATES OF AMERICA

PART ONE
ELEMENTARY PRINCIPLES IN STATISTICAL MECHANICS
DEVELOPED WITH ESPECIAL REFERENCE TO THE RATIONAL FOUNDATION OF THERMODYNAMICS

PREFACE.

The usual point of view in the study of mechanics is that where the attention is mainly directed to the changes which take place in the course of time in a given system. The principal problem is the determination of the condition of the system with respect to configuration and velocities at any required time, when its condition in these respects has been given for some one time, and the fundamental equations are those which express the changes continually taking place in the system. Inquiries of this kind are often simplified by taking into consideration conditions of the system other than those through which it actually passes or is supposed to pass, but our attention is not usually carried beyond conditions differing infinitesimally from those which are regarded as actual. For some purposes, however, it is desirable to take a broader view of the subject.
We may imagine a great number of systems of the same nature, but differing in the configurations and velocities which they have at a given instant, and differing not merely infinitesimally, but it may be so as to embrace every conceivable combination of configuration and velocities. And here we may set the problem, not to follow a particular system through its succession of configurations, but to determine how the whole number of systems will be distributed among the various conceivable configurations and velocities at any required time, when the distribution has been given for some one time. The fundamental equation for this inquiry is that which gives the rate of change of the number of systems which fall within any infinitesimal limits of configuration and velocity.

Such inquiries have been called by Maxwell statistical. They belong to a branch of mechanics which owes its origin to the desire to explain the laws of thermodynamics on mechanical principles, and of which Clausius, Maxwell, and Boltzmann are to be regarded as the principal founders. The first inquiries in this field were indeed somewhat narrower in their scope than that which has been mentioned, being applied to the particles of a system, rather than to independent systems. Statistical inquiries were next directed to the phases (or conditions with respect to configuration and velocity) which succeed one another in a given system in the course of time. The explicit consideration of a great number of systems and their distribution in phase, and of the permanence or alteration of this distribution in the course of time is perhaps first found in Boltzmann’s paper on the “Zusammenhang zwischen den Sätzen über das Verhalten mehratomiger Gasmoleküle mit Jacobi’s Princip des letzten Multiplicators” (1871).
But although, as a matter of history, statistical mechanics owes its origin to investigations in thermodynamics, it seems eminently worthy of an independent development, both on account of the elegance and simplicity of its principles, and because it yields new results and places old truths in a new light in departments quite outside of thermodynamics. Moreover, the separate study of this branch of mechanics seems to afford the best foundation for the study of rational thermodynamics and molecular mechanics.

The laws of thermodynamics, as empirically determined, express the approximate and probable behavior of systems of a great number of particles, or, more precisely, they express the laws of mechanics for such systems as they appear to beings who have not the fineness of perception to enable them to appreciate quantities of the order of magnitude of those which relate to single particles, and who cannot repeat their experiments often enough to obtain any but the most probable results. The laws of statistical mechanics apply to conservative systems of any number of degrees of freedom, and are exact. This does not make them more difficult to establish than the approximate laws for systems of a great many degrees of freedom, or for limited classes of such systems. The reverse is rather the case, for our attention is not diverted from what is essential by the peculiarities of the system considered, and we are not obliged to satisfy ourselves that the effect of the quantities and circumstances neglected will be negligible in the result. The laws of thermodynamics may be easily obtained from the principles of statistical mechanics, of which they are the incomplete expression, but they make a somewhat blind guide in our search for those laws. This is perhaps the principal cause of the slow progress of rational thermodynamics, as contrasted with the rapid deduction of the consequences of its laws as empirically established.
To this must be added that the rational foundation of thermodynamics lay in a branch of mechanics of which the fundamental notions and principles, and the characteristic operations, were alike unfamiliar to students of mechanics.

We may therefore confidently believe that nothing will more conduce to the clear apprehension of the relation of thermodynamics to rational mechanics, and to the interpretation of observed phenomena with reference to their evidence respecting the molecular constitution of bodies, than the study of the fundamental notions and principles of that department of mechanics to which thermodynamics is especially related.

Moreover, we avoid the gravest difficulties when, giving up the attempt to frame hypotheses concerning the constitution of material bodies, we pursue statistical inquiries as a branch of rational mechanics. In the present state of science, it seems hardly possible to frame a dynamic theory of molecular action which shall embrace the phenomena of thermodynamics, of radiation, and of the electrical manifestations which accompany the union of atoms. Yet any theory is obviously inadequate which does not take account of all these phenomena. Even if we confine our attention to the phenomena distinctively thermodynamic, we do not escape difficulties in as simple a matter as the number of degrees of freedom of a diatomic gas. It is well known that while theory would assign to the gas six degrees of freedom per molecule, in our experiments on specific heat we cannot account for more than five. Certainly, one is building on an insecure foundation, who rests his work on hypotheses concerning the constitution of matter.

Difficulties of this kind have deterred the author from attempting to explain the mysteries of nature, and have forced him to be contented with the more modest aim of deducing some of the more obvious propositions relating to the statistical branch of mechanics.
Here, there can be no mistake in regard to the agreement of the hypotheses with the facts of nature, for nothing is assumed in that respect. The only error into which one can fall, is the want of agreement between the premises and the conclusions, and this, with care, one may hope, in the main, to avoid.

The matter of the present volume consists in large measure of results which have been obtained by the investigators mentioned above, although the point of view and the arrangement may be different. These results, given to the public one by one in the order of their discovery, have necessarily, in their original presentation, not been arranged in the most logical manner.

In the first chapter we consider the general problem which has been mentioned, and find what may be called the fundamental equation of statistical mechanics. A particular case of this equation will give the condition of statistical equilibrium, i. e., the condition which the distribution of the systems in phase must satisfy in order that the distribution shall be permanent. In the general case, the fundamental equation admits an integration, which gives a principle which may be variously expressed, according to the point of view from which it is regarded, as the conservation of density-in-phase, or of extension-in-phase, or of probability of phase.

In the second chapter, we apply this principle of conservation of probability of phase to the theory of errors in the calculated phases of a system, when the determination of the arbitrary constants of the integral equations are subject to error. In this application, we do not go beyond the usual approximations. In other words, we combine the principle of conservation of probability of phase, which is exact, with those approximate relations, which it is customary to assume in the “theory of errors.”

In the third chapter we apply the principle of conservation of extension-in-phase to the integration of the differential equations of motion.
This gives Jacobi’s “last multiplier,” as has been shown by Boltzmann.

In the fourth and following chapters we return to the consideration of statistical equilibrium, and confine our attention to conservative systems. We consider especially ensembles of systems in which the index (or logarithm) of probability of phase is a linear function of the energy. This distribution, on account of its unique importance in the theory of statistical equilibrium, I have ventured to call canonical, and the divisor of the energy, the modulus of distribution. The moduli of ensembles have properties analogous to temperature, in that equality of the moduli is a condition of equilibrium with respect to exchange of energy, when such exchange is made possible.

We find a differential equation relating to average values in the ensemble which is identical in form with the fundamental differential equation of thermodynamics, the average index of probability of phase, with change of sign, corresponding to entropy, and the modulus to temperature.

For the average square of the anomalies of the energy, we find an expression which vanishes in comparison with the square of the average energy, when the number of degrees of freedom is indefinitely increased. An ensemble of systems in which the number of degrees of freedom is of the same order of magnitude as the number of molecules in the bodies with which we experiment, if distributed canonically, would therefore appear to human observation as an ensemble of systems in which all have the same energy.

We meet with other quantities, in the development of the subject, which, when the number of degrees of freedom is very great, coincide sensibly with the modulus, and with the average index of probability, taken negatively, in a canonical ensemble, and which, therefore, may also be regarded as corresponding to temperature and entropy.
The correspondence is however imperfect, when the number of degrees of freedom is not very great, and there is nothing to recommend these quantities except that in definition they may be regarded as more simple than those which have been mentioned. In Chapter XIV, this subject of thermodynamic analogies is discussed somewhat at length.

Finally, in Chapter XV, we consider the modification of the preceding results which is necessary when we consider systems composed of a number of entirely similar particles, or, it may be, of a number of particles of several kinds, all of each kind being entirely similar to each other, and when one of the variations to be considered is that of the numbers of the particles of the various kinds which are contained in a system. This supposition would naturally have been introduced earlier, if our object had been simply the expression of the laws of nature. It seemed desirable, however, to separate sharply the purely thermodynamic laws from those special modifications which belong rather to the theory of the properties of matter.

J. W. G.
New Haven, December, 1901.

CONTENTS OF PART ONE

CHAPTER I. GENERAL NOTIONS. THE PRINCIPLE OF CONSERVATION OF EXTENSION-IN-PHASE.
Hamilton’s equations of motion
Ensemble of systems distributed in phase
Extension-in-phase, density-in-phase
Fundamental equation of statistical mechanics
Condition of statistical equilibrium
Principle of conservation of density-in-phase
Principle of conservation of extension-in-phase
Analogy in hydrodynamics
Extension-in-phase is an invariant
Dimensions of extension-in-phase
Various analytical expressions of the principle
Coefficient and index of probability of phase
Principle of conservation of probability of phase
Dimensions of coefficient of probability of phase

CHAPTER II. APPLICATION OF THE PRINCIPLE OF CONSERVATION OF EXTENSION-IN-PHASE TO THE THEORY OF ERRORS.
Approximate expression for the index of probability of phase
Application of the principle of conservation of probability of phase to the constants of this expression

CHAPTER III. APPLICATION OF THE PRINCIPLE OF CONSERVATION OF EXTENSION-IN-PHASE TO THE INTEGRATION OF THE DIFFERENTIAL EQUATIONS OF MOTION.
Case in which the forces are function of the coordinates alone
Case in which the forces are functions of the coordinates with the time

CHAPTER IV. ON THE DISTRIBUTION-IN-PHASE CALLED CANONICAL, IN WHICH THE INDEX OF PROBABILITY IS A LINEAR FUNCTION OF THE ENERGY.
Condition of statistical equilibrium
Other conditions which the coefficient of probability must satisfy
Canonical distribution. Modulus of distribution
$\psi$ must be finite
The modulus of the canonical distribution has properties analogous to temperature
Other distributions have similar properties
Distribution in which the index of probability is a linear function of the energy and of the moments of momentum about three axes
Case in which the forces are linear functions of the displacements, and the index is a linear function of the separate energies relating to the normal types of motion
Differential equation relating to average values in a canonical ensemble
This is identical in form with the fundamental differential equation of thermodynamics

CHAPTER V. AVERAGE VALUES IN A CANONICAL ENSEMBLE OF SYSTEMS.
Case of $\nu$ material points. Average value of kinetic energy of a single point for a given configuration or for the whole ensemble $= \tfrac{3}{2}\Theta$
Average value of total kinetic energy for any given configuration or for the whole ensemble $= \tfrac{3}{2}\nu\Theta$
System of $n$ degrees of freedom. Average value of kinetic energy, for any given configuration or for the whole ensemble $= \tfrac{n}{2}\Theta$
Second proof of the same proposition
Distribution of canonical ensemble in configuration
Ensembles canonically distributed in configuration
Ensembles canonically distributed in velocity

CHAPTER VI. EXTENSION-IN-CONFIGURATION AND EXTENSION-IN-VELOCITY.
Extension-in-configuration and extension-in-velocity are invariants
Dimensions of these quantities
Index and coefficient of probability of configuration
Index and coefficient of probability of velocity
Dimensions of these coefficients
Relation between extension-in-configuration and extension-in-velocity
Definitions of extension-in-phase, extension-in-configuration, and extension-in-velocity, without explicit mention of coordinates

CHAPTER VII. FARTHER DISCUSSION OF AVERAGES IN A CANONICAL ENSEMBLE OF SYSTEMS.
Second and third differential equations relating to average values in a canonical ensemble
These are identical in form with thermodynamic equations enunciated by Clausius
Average square of the anomaly of the energy, of the kinetic energy, of the potential energy
These anomalies are insensible to human observation and experience when the number of degrees of freedom of the system is very great
Average values of powers of the energies
Average values of powers of the anomalies of the energies
Average values relating to forces exerted on external bodies
General formulae relating to averages in a canonical ensemble

CHAPTER VIII. ON CERTAIN IMPORTANT FUNCTIONS OF THE ENERGIES OF A SYSTEM.
Definitions. $V$ = extension-in-phase below a limiting energy ($\epsilon$). $\phi = \log dV/d\epsilon$
$V_q$ = extension-in-configuration below a limiting value of the potential energy ($\epsilon_q$). $\phi_q = \log dV_q/d\epsilon_q$
$V_p$ = extension-in-velocity below a limiting value of the kinetic energy ($\epsilon_p$). $\phi_p = \log dV_p/d\epsilon_p$
Evaluation of $V_p$ and $\phi_p$
Approximate formulae for large values of $n$
Calculation of $V$ or $\phi$ for whole system when given for parts
Geometrical illustration

CHAPTER IX. THE FUNCTION $\phi$ AND THE CANONICAL DISTRIBUTION.
When $n > 2$, the most probable value of the energy in a canonical ensemble is determined by $d\phi/d\epsilon = 1/\Theta$
When $n > 2$, the average value of $d\phi/d\epsilon$ in a canonical ensemble is $1/\Theta$
When $n$ is large, the value of $\phi$ corresponding to $d\phi/d\epsilon = 1/\Theta$ ($\phi_0$) is nearly equivalent (except for an additive constant) to the average index of probability taken negatively ($-\bar{\eta}$)
Approximate formulae for $\phi_0 + \bar{\eta}$ when $n$ is large
When $n$ is large, the distribution of a canonical ensemble in energy follows approximately the law of errors
This is not peculiar to the canonical distribution
Averages in a canonical ensemble

CHAPTER X. ON A DISTRIBUTION IN PHASE CALLED MICROCANONICAL IN WHICH ALL THE SYSTEMS HAVE THE SAME ENERGY.
The microcanonical distribution defined as the limiting distribution obtained by various processes
Average values in the microcanonical ensemble of functions of the kinetic and potential energies
If two quantities have the same average values in every microcanonical ensemble, they have the same average value in every canonical ensemble
Average values in the microcanonical ensemble of functions of the energies of parts of the system
Average values of functions of the kinetic energy of a part of the system
Average values of the external forces in a microcanonical ensemble. Differential equation relating to these averages, having the form of the fundamental differential equation of thermodynamics

CHAPTER XI. MAXIMUM AND MINIMUM PROPERTIES OF VARIOUS DISTRIBUTIONS IN PHASE.
Theorems I-VI. Minimum properties of certain distributions
Theorem VII. The average index of the whole system compared with the sum of the average indices of the parts
Theorem VIII. The average index of the whole ensemble compared with the average indices of parts of the ensemble
Theorem IX. Effect on the average index of making the distribution-in-phase uniform within any limits

CHAPTER XII. ON THE MOTION OF SYSTEMS AND ENSEMBLES OF SYSTEMS THROUGH LONG PERIODS OF TIME.
Under what conditions, and with what limitations, may we assume that a system will return in the course of time to its original phase, at least to any required degree of approximation?
Tendency in an ensemble of isolated systems toward a state of statistical equilibrium

CHAPTER XIII. EFFECT OF VARIOUS PROCESSES ON AN ENSEMBLE OF SYSTEMS.
Variation of the external coordinates can only cause a decrease in the average index of probability
This decrease may in general be diminished by diminishing the rapidity of the change in the external coordinates
The mutual action of two ensembles can only diminish the sum of their average indices of probability
In the mutual action of two ensembles which are canonically distributed, that which has the greater modulus will lose energy
Repeated action between any ensemble and others which are canonically distributed with the same modulus will tend to distribute the first-mentioned ensemble canonically with the same modulus
Process analogous to a Carnot’s cycle
Analogous processes in thermodynamics

CHAPTER XIV. DISCUSSION OF THERMODYNAMIC ANALOGIES.
The finding in rational mechanics an a priori foundation for thermodynamics requires mechanical definitions of temperature and entropy. Conditions which the quantities thus defined must satisfy
The modulus of a canonical ensemble ($\Theta$), and the average index of probability taken negatively ($\bar{\eta}$), as analogues of temperature and entropy
The functions of the energy $d\epsilon / d \log V$ and $\log V$ as analogues of temperature and entropy
The functions of the energy $d\epsilon/d$

These equations will hold for any forces whatever. If the forces are conservative, in other words, if the expression (1) is an exact differential, we may set
$$F_1 = -\frac{d\epsilon_q}{dq_1}, \quad \text{etc.,} \tag{4}$$
where $\epsilon_q$ is a function of the coordinates which we shall call the potential energy of the system. If we write $\epsilon$ for the total energy, we shall have
$$\epsilon = \epsilon_p + \epsilon_q, \tag{5}$$
and equations (3) may be written
$$\dot{q}_1 = \frac{d\epsilon}{dp_1}, \quad \dot{p}_1 = -\frac{d\epsilon}{dq_1}, \quad \text{etc.} \tag{6}$$
The potential energy ($\epsilon_q$) may depend on other variables beside the coordinates $q_1 \ldots q_n$. We shall often suppose it to depend in part on coordinates of external bodies, which we shall denote by $a_1$, $a_2$, etc. We shall then have for the complete value of the differential of the potential energy*
$$d\epsilon_q = -F_1\,dq_1 \ldots - F_n\,dq_n - A_1\,da_1 - A_2\,da_2 - \text{etc.,} \tag{7}$$
where $A_1$, $A_2$, etc., represent forces (in the generalized sense) exerted by the system on external bodies. For the total energy ($\epsilon$) we shall have
$$d\epsilon = \dot{q}_1\,dp_1 \ldots + \dot{q}_n\,dp_n - \dot{p}_1\,dq_1 \ldots - \dot{p}_n\,dq_n - A_1\,da_1 - A_2\,da_2 - \text{etc.} \tag{8}$$
It will be observed that the kinetic energy ($\epsilon_p$) in the most general case is a quadratic function of the $p$'s (or $\dot{q}$'s) involving also the $q$'s but not the $a$'s; that the potential energy, when it exists, is function of the $q$'s and $a$'s; and that the total energy, when it exists, is function of the $p$'s (or $\dot{q}$'s), the $q$'s, and the $a$'s. In expressions like $d\epsilon/dq_1$, the $p$'s, and not the $\dot{q}$'s, are to be taken as independent variables, as has already been stated with respect to the kinetic energy.

* It will be observed, that although we call $\epsilon_q$ the potential energy of the system which we are considering, it is really so defined as to include that energy which might be described as mutual to that system and external bodies.

Let us imagine a great number of independent systems, identical in nature, but differing in phase, that is, in their condition with respect to configuration and velocity.
The forces are supposed to be determined for every system by the same law, being functions of the coordinates of the system $q_1, \ldots q_n$, either alone or with the coordinates $a_1$, $a_2$, etc. of certain external bodies. It is not necessary that they should be derivable from a force-function. The external coordinates $a_1$, $a_2$, etc. may vary with the time, but at any given time have fixed values. In this they differ from the internal coordinates $q_1, \ldots q_n$, which at the same time have different values in the different systems considered.

Let us especially consider the number of systems which at a given instant fall within specified limits of phase, viz., those for which
$$p_1' < p_1 < p_1'', \quad q_1' < q_1 < q_1'', \quad \ldots \quad p_n' < p_n < p_n'', \quad q_n' < q_n < q_n'', \tag{9}$$
the accented letters denoting constants. We shall suppose the differences $p_1'' - p_1'$, $q_1'' - q_1'$, etc. to be infinitesimal, and that the systems are distributed in phase in some continuous manner,* so that the number having phases within the limits specified may be represented by
$$D\,(p_1'' - p_1') \ldots (p_n'' - p_n')\,(q_1'' - q_1') \ldots (q_n'' - q_n'), \tag{10}$$
or more briefly by
$$D\,dp_1 \ldots dp_n\,dq_1 \ldots dq_n, \tag{11}$$
where $D$ is a function of the $p$'s and $q$'s and in general of $t$ also, for as time goes on, and the individual systems change their phases, the distribution of the ensemble in phase will in general vary. In special cases, the distribution in phase will remain unchanged. These are cases of statistical equilibrium.

* In strictness, a finite number of systems cannot be distributed continuously in phase. But by increasing indefinitely the number of systems, we may approximate to a continuous law of distribution, such as is here described. To avoid tedious circumlocution, language like the above may be allowed, although wanting in precision of expression, when the sense in which it is to be taken appears sufficiently clear.
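
The counting of systems within specified limits of phase can be illustrated numerically. The following Python sketch is only an illustration and not part of the original text. It assumes a single degree of freedom, an ensemble laid out on a regular grid of phases, and, for the time evolution, a harmonic oscillator of unit mass and unit frequency, for which equations (6) have the closed-form solution used in `evolve`. All function names are hypothetical.

```python
import math

def make_ensemble(n_side=200, half_width=1.0):
    """Place n_side**2 systems on a regular grid of phases (p, q) filling a square."""
    step = 2.0 * half_width / n_side
    values = [-half_width + (i + 0.5) * step for i in range(n_side)]
    return [(p, q) for p in values for q in values]

def count_in_element(ensemble, p_lo, p_hi, q_lo, q_hi):
    """Number of systems with p_lo < p < p_hi and q_lo < q < q_hi, as in (9)."""
    return sum(1 for p, q in ensemble if p_lo < p < p_hi and q_lo < q < q_hi)

def density_in_phase(ensemble, p_lo, p_hi, q_lo, q_hi):
    """Estimate D as (number of systems in the element) / (extent of the element)."""
    extension = (p_hi - p_lo) * (q_hi - q_lo)
    return count_in_element(ensemble, p_lo, p_hi, q_lo, q_hi) / extension

def evolve(ensemble, t):
    """Assumed dynamics: energy (p**2 + q**2) / 2, so dq/dt = p and dp/dt = -q.
    The exact solution is a rotation of every phase through the angle t."""
    c, s = math.cos(t), math.sin(t)
    return [(p * c - q * s, q * c + p * s) for p, q in ensemble]

ensemble = make_ensemble()
element = (0.9, 1.0, 0.9, 1.0)  # limits p', p'', q', q'' of a fixed element
before = count_in_element(ensemble, *element)
after = count_in_element(evolve(ensemble, math.pi / 4), *element)
print(before, after)  # the number of systems in a fixed element changes with time
```

With these assumed values the fixed element near the corner of the square holds 100 systems at first and none after an eighth of a period, which is an instance of a distribution in phase that is not in statistical equilibrium.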
If we regard all possible phases as forming a sort of extension of $2n$ dimensions, we may regard the product of differentials in (11) as expressing an element of this extension, and $D$ as expressing the density of the systems in that element. We shall call the product
$$dp_1 \ldots dp_n\,dq_1 \ldots dq_n \tag{12}$$
an element of extension-in-phase, and $D$ the density-in-phase of the systems.

It is evident that the changes which take place in the density of the systems in any given element of extension-in-phase will depend on the dynamical nature of the systems and their distribution in phase at the time considered.

In the case of conservative systems, with which we shall be principally concerned, their dynamical nature is completely determined by the function which expresses the energy ($\epsilon$) in terms of the $p$'s, $q$'s, and $a$'s (a function supposed identical for all the systems); in the more general case which we are considering, the dynamical nature of the systems is determined by the functions which express the kinetic energy ($\epsilon_p$) in terms of the $p$'s and $q$'s, and the forces in terms of the $q$'s and $a$'s. The distribution in phase is expressed for the time considered by $D$ as function of the $p$'s and $q$'s. To find the value of $dD/dt$ for the specified element of extension-in-phase, we observe that the number of systems within the limits can only be varied by systems passing the limits, which may take place in $4n$ different ways, viz., by the $p_1$ of a system passing the limit $p_1'$, or the limit $p_1''$, or by the $q_1$ of a system passing the limit $q_1'$, or the limit $q_1''$, etc. Let us consider these cases separately.

In the first place, let us consider the number of systems which in the time $dt$ pass into or out of the specified element by $p_1$ passing the limit $p_1'$.
It will be convenient, and it is evidently allowable, to suppose $dt$ so small that the quantities $\dot{p}_1\,dt$, $\dot{q}_1\,dt$, etc., which represent the increments of $p_1$, $q_1$, etc., in the time $dt$ shall be infinitely small in comparison with the infinitesimal differences $p_1'' - p_1'$, $q_1'' - q_1'$, etc., which determine the magnitude of the element of extension-in-phase. The systems for which $p_1$ passes the limit $p_1'$ in the interval $dt$ are those for which at the commencement of this interval the value of $p_1$ lies between $p_1'$ and $p_1' - \dot{p}_1\,dt$, as is evident if we consider separately the cases in which $\dot{p}_1$ is positive and negative. Those systems for which $p_1$ lies between these limits, and the other $p$'s and $q$'s between the limits specified in (9), will therefore pass into or out of the element considered according as $\dot{p}_1$ is positive or negative, unless indeed they also pass some other limit specified in (9) during the same interval of time. But the number which pass any two of these limits will be represented by an expression containing the square of $dt$ as a factor, and is evidently negligible, when $dt$ is sufficiently small, compared with the number which we are seeking to evaluate, and which (with neglect of terms containing $dt^2$) may be found by substituting $\dot{p}_1\,dt$ for $p_1'' - p_1'$ in (10) or for $dp_1$ in (11). The expression
$$D\,\dot{p}_1\,dt\,dp_2 \ldots dp_n\,dq_1 \ldots dq_n \tag{13}$$
will therefore represent, according as it is positive or negative, the increase or decrease of the number of systems within the given limits which is due to systems passing the limit $p_1'$. A similar expression, in which however $D$ and $\dot{p}_1$ will have slightly different values (being determined for $p_1''$ instead of $p_1'$), will represent the decrease or increase of the number of systems due to the passing of the limit $p_1''$. The difference of the two expressions, or
$$\frac{d(D\,\dot{p}_1)}{dp_1}\,dp_1 \ldots dp_n\,dq_1 \ldots dq_n\,dt \tag{14}$$
will represent algebraically the decrease of the number of systems within the limits due to systems passing the limits $p_1'$ and $p_1''$.
The decrease in the number of systems within the limits due to systems passing the limits $q_1'$ and $q_1''$ may be found in the same way. This will give
$$\left(\frac{d(D\,\dot{p}_1)}{dp_1} + \frac{d(D\,\dot{q}_1)}{dq_1}\right) dp_1 \ldots dp_n\,dq_1 \ldots dq_n\,dt \tag{15}$$
for the decrease due to passing the four limits $p_1'$, $p_1''$, $q_1'$, $q_1''$. But since the equations of motion (3) give
$$\frac{d\dot{p}_1}{dp_1} + \frac{d\dot{q}_1}{dq_1} = 0, \tag{16}$$
the expression reduces to
$$\left(\frac{dD}{dp_1}\,\dot{p}_1 + \frac{dD}{dq_1}\,\dot{q}_1\right) dp_1 \ldots dp_n\,dq_1 \ldots dq_n\,dt. \tag{17}$$
If we prefix $\Sigma$ to denote summation relative to the suffixes $1 \ldots n$, we get the total decrease in the number of systems within the limits in the time $dt$. That is,
$$\Sigma\left(\frac{dD}{dp_1}\,\dot{p}_1 + \frac{dD}{dq_1}\,\dot{q}_1\right) dp_1 \ldots dp_n\,dq_1 \ldots dq_n\,dt = -dD\,dp_1 \ldots dp_n\,dq_1 \ldots dq_n, \tag{18}$$
or
$$\left(\frac{dD}{dt}\right)_{p,q} = -\Sigma\left(\frac{dD}{dp_1}\,\dot{p}_1 + \frac{dD}{dq_1}\,\dot{q}_1\right), \tag{19}$$
where the suffix applied to the differential coefficient indicates that the $p$'s and $q$'s are to be regarded as constant in the differentiation. The condition of statistical equilibrium is therefore
$$\Sigma\left(\frac{dD}{dp_1}\,\dot{p}_1 + \frac{dD}{dq_1}\,\dot{q}_1\right) = 0. \tag{20}$$
If at any instant this condition is fulfilled for all values of the $p$'s and $q$'s, $(dD/dt)_{p,q}$ vanishes, and therefore the condition will continue to hold, and the distribution in phase will be permanent, so long as the external coordinates remain constant. But the statistical equilibrium would in general be disturbed by a change in the values of the external coordinates, which would alter the values of the $\dot{p}$'s as determined by equations (3), and thus disturb the relation expressed in the last equation.

If we write equation (19) in the form
$$\left(\frac{dD}{dt}\right)_{p,q} dt + \Sigma\left(\frac{dD}{dp_1}\,\dot{p}_1\,dt + \frac{dD}{dq_1}\,\dot{q}_1\,dt\right) = 0, \tag{21}$$
it will be seen to express a theorem of remarkable simplicity. Since $D$ is a function of $t$, $p_1, \ldots p_n$, $q_1, \ldots q_n$, its complete differential will consist of parts due to the variations of all these quantities. Now the first term of the equation represents the increment of $D$ due to an increment of $t$ (with constant values of the $p$'s and $q$'s), and the rest of the first member represents the increments of $D$ due to increments of the $p$'s and $q$'s, expressed by $\dot{p}_1\,dt$, $\dot{q}_1\,dt$, etc.
But these are precisely the increments which the $p$'s and $q$'s receive in the movement of a system in the time $dt$. The whole expression represents the total increment of $D$ for the varying phase of a moving system. We have therefore the theorem:

In an ensemble of mechanical systems identical in nature and subject to forces determined by identical laws, but distributed in phase in any continuous manner, the density-in-phase is constant in time for the varying phases of a moving system; provided, that the forces of a system are functions of its coordinates, either alone or with the time.*

This may be called the principle of conservation of density-in-phase. It may also be written
$$\left(\frac{dD}{dt}\right)_{a,\ldots h} = 0, \tag{22}$$
where $a, \ldots h$ represent the arbitrary constants of the integral equations of motion, and are suffixed to the differential coefficient to indicate that they are to be regarded as constant in the differentiation.

* The condition that the forces $F_1, \ldots F_n$ are functions of $q_1, \ldots q_n$ and $a_1$, $a_2$, etc., which last are functions of the time, is analytically equivalent to the condition that $F_1, \ldots F_n$ are functions of $q_1, \ldots q_n$ and the time. Explicit mention of the external coordinates, $a_1$, $a_2$, etc., has been made in the preceding pages, because our purpose will require us hereafter to consider these coordinates and the connected forces, $A_1$, $A_2$, etc., which represent the action of the systems on external bodies.

We may give to this principle a slightly different expression. Let us call the value of the integral
$$\int\cdots\int dp_1 \ldots dp_n\,dq_1 \ldots dq_n \tag{23}$$
taken within any limits the extension-in-phase within those limits.

When the phases bounding an extension-in-phase vary in the course of time according to the dynamical laws of a system subject to forces which are functions of the coordinates either alone or with the time, the value of the extension-in-phase thus bounded remains constant. In this form the principle may be called the principle of conservation of extension-in-phase.
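The constancy of an extension-in-phase can be checked numerically in a simple case. The following is only an illustrative sketch added for the reader, not part of the original demonstration. It assumes a single pendulum with energy $p^2/2 - \cos q$, advances it with a leapfrog scheme (a swapped-in numerical method, chosen because each of its sub-steps is itself area-preserving), and estimates by central differences the functional determinant of the final phase with respect to the initial phase. The names `flow` and `jacobian_det`, the step count, and the starting phase are all assumptions made for the illustration.

```python
import math

def flow(q, p, t_total=3.0, steps=3000):
    """Advance a pendulum (energy p**2/2 - cos q) by leapfrog steps."""
    h = t_total / steps
    for _ in range(steps):
        p -= 0.5 * h * math.sin(q)   # half kick: dp/dt = -d(energy)/dq
        q += h * p                   # drift:     dq/dt = +d(energy)/dp
        p -= 0.5 * h * math.sin(q)   # second half kick
    return q, p

def jacobian_det(q0, p0, eps=1e-5):
    """Central-difference estimate of d(q, p)/d(q0, p0) for the map `flow`."""
    qa, pa = flow(q0 + eps, p0)
    qb, pb = flow(q0 - eps, p0)
    qc, pc = flow(q0, p0 + eps)
    qd, pd = flow(q0, p0 - eps)
    dq_dq0 = (qa - qb) / (2 * eps)
    dp_dq0 = (pa - pb) / (2 * eps)
    dq_dp0 = (qc - qd) / (2 * eps)
    dp_dp0 = (pc - pd) / (2 * eps)
    return dq_dq0 * dp_dp0 - dq_dp0 * dp_dq0

# Illustrative starting phase; any phase could be used.
det = jacobian_det(1.0, 0.5)
print(det)
```

If the principle holds for this example, `det` should differ from unity only by the error of the finite-difference estimate, which says that an infinitesimal element of extension-in-phase keeps its magnitude as its bounding phases move.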
In some respects this form may be regarded as the most simple statement of the principle, since it contains no explicit reference to an ensemble of systems.

Since any extension-in-phase may be divided into infinitesimal portions, it is only necessary to prove the principle for an infinitely small extension. The number of systems of an ensemble which fall within the extension will be represented by the integral
$$\int\cdots\int D\,dp_1 \ldots dp_n\,dq_1 \ldots dq_n.$$
If the extension is infinitely small, we may regard $D$ as constant in the extension and write
$$D \int\cdots\int dp_1 \ldots dp_n\,dq_1 \ldots dq_n$$
for the number of systems. The value of this expression must be constant in time, since no systems are supposed to be created or destroyed, and none can pass the limits, because the motion of the limits is identical with that of the systems. But we have seen that $D$ is constant in time, and therefore the integral
$$\int\cdots\int dp_1 \ldots dp_n\,dq_1 \ldots dq_n,$$
which we have called the extension-in-phase, is also constant in time.*

Since the system of coordinates employed in the foregoing discussion is entirely arbitrary, the values of the coordinates relating to any configuration and its immediate vicinity do not impose any restriction upon the values relating to other configurations. The fact that the quantity which we have called density-in-phase is constant in time for any given system, implies therefore that its value is independent of the coordinates which are used in its evaluation. For let the density-in-phase as evaluated for the same time and phase by one system of coordinates be $D_1'$, and by another system $D_2'$. A system which at that time has that phase will at another time have another phase. Let the density as calculated for this second time and phase by a third system of coordinates be $D_3''$. Now we may imagine a system of coordinates which at and near the first configuration will coincide with the first system of coordinates, and at and near the second configuration will coincide with the third system of coordinates. This will give $D_1' = D_3''$.
Again we may imagine a system of coordinates which at and near the first configuration will coincide with the second system of coordinates, and at and near the second configuration will coincide with the third system of coordinates. This will give $D_2' = D_3''$. We have therefore $D_1' = D_2'$.

* If we regard a phase as represented by a point in space of $2n$ dimensions, the changes which take place in the course of time in our ensemble of systems will be represented by a current in such space. This current will be steady so long as the external coordinates are not varied. In any case the current will satisfy a law which in its various expressions is analogous to the hydrodynamic law which may be expressed by the phrases conservation of volumes or conservation of density about a moving point, or by the equation
$$\frac{d\dot x}{dx} + \frac{d\dot y}{dy} + \frac{d\dot z}{dz} = 0.$$
The analogue in statistical mechanics of this equation, viz.,
$$\frac{d\dot p_1}{dp_1} + \frac{d\dot q_1}{dq_1} + \frac{d\dot p_2}{dp_2} + \frac{d\dot q_2}{dq_2} + \text{etc.} = 0,$$
may be derived directly from equations (3) or (6), and may suggest such theorems as have been enunciated, if indeed it is not regarded as making them intuitively evident. The somewhat lengthy demonstrations given above will at least serve to give precision to the notions involved, and familiarity with their use.

It follows, or it may be proved in the same way, that the value of an extension-in-phase is independent of the system of coordinates which is used in its evaluation. This may easily be verified directly. If $q_1, \ldots q_n$, $Q_1, \ldots Q_n$ are two systems of coordinates, and $p_1, \ldots p_n$, $P_1, \ldots P_n$ the corresponding momenta, we have to prove that
$$\int\cdots\int dp_1 \ldots dp_n\,dq_1 \ldots dq_n = \int\cdots\int dP_1 \ldots dP_n\,dQ_1 \ldots dQ_n, \tag{24}$$
when the multiple integrals are taken within limits consisting of the same phases. And this will be evident from the principle on which we change the variables in a multiple integral, if we prove that
$$\frac{d(P_1, \ldots P_n, Q_1, \ldots Q_n)}{d(p_1, \ldots p_n, q_1, \ldots q_n)} = 1, \tag{25}$$
where the first member of the equation represents a Jacobian or functional determinant. Since all its elements of the form $dQ/dp$ are equal to zero, the determinant reduces to a product of two, and we have to prove that
$$\frac{d(P_1, \ldots P_n)}{d(p_1, \ldots p_n)}\,\frac{d(Q_1, \ldots Q_n)}{d(q_1, \ldots q_n)} = 1. \tag{26}$$
We may transform any element of the first of these determinants as follows. By equations (2) and (3), and in view of the fact that the $\dot Q$'s are linear functions of the $\dot q$'s and therefore of the $p$'s, with coefficients involving the $q$'s, so that a differential coefficient of the form $d\dot Q_r/dp_y$ is a function of the $q$'s alone, we get*
$$\frac{dP_x}{dp_y} = \frac{d}{dp_y}\frac{d\epsilon_p}{d\dot Q_x} = \sum_{r=1}^{n}\left(\frac{d^2\epsilon_p}{d\dot Q_r\,d\dot Q_x}\,\frac{d\dot Q_r}{dp_y}\right) = \frac{d}{d\dot Q_x}\sum_{r=1}^{n}\left(\frac{d\epsilon_p}{d\dot Q_r}\,\frac{d\dot Q_r}{dp_y}\right) = \frac{d}{d\dot Q_x}\frac{d\epsilon_p}{dp_y} = \frac{d\dot q_y}{d\dot Q_x}. \tag{27}$$
But since
$$\dot q_y = \sum_{r=1}^{n}\left(\frac{dq_y}{dQ_r}\,\dot Q_r\right),$$
$$\frac{d\dot q_y}{d\dot Q_x} = \frac{dq_y}{dQ_x}. \tag{28}$$
Therefore,
$$\frac{d(P_1, \ldots P_n)}{d(p_1, \ldots p_n)} = \frac{d(\dot q_1, \ldots \dot q_n)}{d(\dot Q_1, \ldots \dot Q_n)} = \frac{d(q_1, \ldots q_n)}{d(Q_1, \ldots Q_n)}. \tag{29}$$
The equation to be proved is thus reduced to
$$\frac{d(q_1, \ldots q_n)}{d(Q_1, \ldots Q_n)}\,\frac{d(Q_1, \ldots Q_n)}{d(q_1, \ldots q_n)} = 1, \tag{30}$$
which is easily proved by the ordinary rule for the multiplication of determinants.

* The form of the equation
$$\frac{d}{dp_y}\frac{d\epsilon_p}{d\dot Q_x} = \frac{d}{d\dot Q_x}\frac{d\epsilon_p}{dp_y}$$
in (27) reminds us of the fundamental identity in the differential calculus relating to the order of differentiation with respect to independent variables. But it will be observed that here the variables $\dot Q_x$ and $p_y$ are not independent and that the proof depends on the linear relation between the $\dot Q$'s and the $p$'s.

The numerical value of an extension-in-phase will however depend on the units in which we measure energy and time. For a product of the form $dp\,dq$ has the dimensions of energy multiplied by time, as appears from equation (2), by which the momenta are defined. Hence an extension-in-phase has the dimensions of the $n$th power of the product of energy and time.
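The unit value of the functional determinant in (25) can be verified numerically for a particular change of coordinates. The sketch below is an illustration only, resting on assumptions that are not in the text: a single degree of freedom with kinetic energy $\tfrac12 m\dot q^2$, and the new coordinate $Q = q^3 + q$. For that case the definition of the momenta gives $P = p / (dQ/dq)$. The function names are hypothetical.

```python
def new_phase(p, q):
    """Map (p, q) to (P, Q) for the assumed point transformation Q = q**3 + q."""
    dQ_dq = 3.0 * q * q + 1.0       # derivative of the new coordinate
    return p / dQ_dq, q**3 + q      # P = p / (dQ/dq), Q = q**3 + q

def functional_determinant(p, q, eps=1e-6):
    """Central-difference estimate of d(P, Q)/d(p, q)."""
    P_a, Q_a = new_phase(p + eps, q)
    P_b, Q_b = new_phase(p - eps, q)
    P_c, Q_c = new_phase(p, q + eps)
    P_d, Q_d = new_phase(p, q - eps)
    dP_dp = (P_a - P_b) / (2 * eps)
    dQ_dp = (Q_a - Q_b) / (2 * eps)   # vanishes, since Q does not involve p
    dP_dq = (P_c - P_d) / (2 * eps)
    dQ_dq = (Q_c - Q_d) / (2 * eps)
    return dP_dp * dQ_dq - dP_dq * dQ_dp

# A few arbitrary phases at which to evaluate the determinant.
dets = [functional_determinant(p, q)
        for p, q in [(0.3, -1.2), (2.0, 0.5), (-1.5, 1.7)]]
print(dets)
```

Each entry of `dets` should be unity to within the error of the finite differences, whatever phase is chosen, because the element $dQ/dp$ vanishes and the two remaining factors are reciprocal.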
In other words, it has the dimensions of the nth power of action, as the term is used in the 6 principle of Least Action.’ If we distinguish by accents the values of the momenta and coordinates which belong to a time tf, the unaccented letters relating to the time t, the principle of the conservation of extension-in-phase may be written dp1...dpndq1.^dqn— I ... I dpx'... dpnr dqS... dqj, (31) or more briefly (32)14 CONSERVATION OF the limiting phases being those which belong to the same systems at the times t and tf respectively. But we have identically d(Pi d(Pi' i9-.-qJ _ d(pu...qn) dipS, .. . qj) d(a, ... h) d(pS, ... qj) d(a, ...h) in connection with equation (33). Since the coordinates and momenta are functions of a, ... A, and t, the determinant in (36) must be a function of the same variables, and since it does not vary with the time, it must be a function of a, ... A alone. We have therefore d(pi,...qn) d(a9 ... A) func. ( J. . f dPl’. .. dqn" = P> j\ . .jdp," . .. dqj>. Now the principle of the conservation of extension-in-phase, which has been proved (viz., in,the second demonstration given above) independently of any reference to an ensemble of systems, requires that the values of the multiple integrals in this equation shall be equal. This gives P" = P'. With reference to an important class of cases this principle may be enunciated as follows. When the differential equations of motion are exactly known, but the constants of the integral equations imperfectly determined, the coefficient of probability of any phase at any time is equal to the coefficient of probability of the corresponding phase at any other time. By corresponding phases are meant those which are calculated for different times from the same values of the arbitrary constants of the integral equations. Since the sum of the probabilities of all possible cases is necessarily unity, it is evident that we must have all J...Jpdp1...dqn = l, (46) where the integration extends over all phases. 
This is indeed only a different form of the equation
$$N = \int_{\text{phases}}^{\text{all}}\!\!\cdots\!\int D\,dp_1 \ldots dq_n,$$
which we may regard as defining $N$.

The values of the coefficient and index of probability of phase, like that of the density-in-phase, are independent of the system of coordinates which is employed to express the distribution in phase of a given ensemble.

In dimensions, the coefficient of probability is the reciprocal of an extension-in-phase, that is, the reciprocal of the $n$th power of the product of time and energy. The index of probability is therefore affected by an additive constant when we change our units of time and energy. If the unit of time is multiplied by $c_t$ and the unit of energy is multiplied by $c_\epsilon$, all indices of probability relating to systems of $n$ degrees of freedom will be increased by the addition of
$$n \log c_t + n \log c_\epsilon. \tag{47}$$

CHAPTER II.

APPLICATION OF THE PRINCIPLE OF CONSERVATION OF EXTENSION-IN-PHASE TO THE THEORY OF ERRORS.

Let us now proceed to combine the principle which has been demonstrated in the preceding chapter and which in its different applications and regarded from different points of view has been variously designated as the conservation of density-in-phase, or of extension-in-phase, or of probability of phase, with those approximate relations which are generally used in the 'theory of errors.'

We suppose that the differential equations of the motion of a system are exactly known, but that the constants of the integral equations are only approximately determined. It is evident that the probability that the momenta and coordinates at the time $t'$ fall between the limits $p_1'$ and $p_1' + dp_1'$, $q_1'$ and $q_1' + dq_1'$, etc., may be expressed by the formula
$$e^{\eta'}\,dp_1' \ldots dq_n', \tag{48}$$
where $\eta'$ (the index of probability for the phase in question) is a function of the coordinates and momenta and of the time.

Let $P_1'$, $Q_1'$, etc.
be the values of the coordinates and momenta which give the maximum value to $\eta'$, and let the general value of $\eta'$ be developed by Taylor's theorem according to ascending powers and products of the differences $p_1' - P_1'$, $q_1' - Q_1'$, etc., and let us suppose that we have a sufficient approximation without going beyond terms of the second degree in these differences. We may therefore set
$$\eta' = c - F', \tag{49}$$
where $c$ is independent of the differences $p_1' - P_1'$, $q_1' - Q_1'$, etc., and $F'$ is a homogeneous quadratic function of these differences. The terms of the first degree vanish in virtue of the maximum condition, which also requires that $F'$ must have a positive value except when all the differences mentioned vanish. If we set
$$C = e^{c}, \tag{50}$$
we may write for the probability that the phase lies within the limits considered
$$C\,e^{-F'}\,dp_1' \ldots dq_n'. \tag{51}$$
$C$ is evidently the maximum value of the coefficient of probability at the time considered.

In regard to the degree of approximation represented by these formulae, it is to be observed that we suppose, as is usual in the 'theory of errors,' that the determination (explicit or implicit) of the constants of motion is of such precision that the coefficient of probability $e^{\eta'}$ or $C e^{-F'}$ is practically zero except for very small values of the differences $p_1' - P_1'$, $q_1' - Q_1'$, etc. For very small values of these differences the approximation is evidently in general sufficient; for larger values of these differences the value of $C e^{-F'}$ will be sensibly zero, as it should be, and in this sense the formula will represent the facts.

We shall suppose that the forces to which the system is subject are functions of the coordinates either alone or with the time. The principle of conservation of probability of phase will therefore apply, which requires that at any other time ($t''$) the maximum value of the coefficient of probability shall be the same as at the time $t'$, and that the phase ($P_1''$, $Q_1''$, etc.)
which has this greatest probability-coefficient, shall be that which corresponds to the phase ($P_1'$, $Q_1'$, etc.), i. e., which is calculated from the same values of the constants of the integral equations of motion.

We may therefore write for the probability that the phase at the time $t''$ falls within the limits $p_1''$ and $p_1'' + dp_1''$, $q_1''$ and $q_1'' + dq_1''$, etc.,
$$C\,e^{-F''}\,dp_1'' \ldots dq_n'', \tag{52}$$
where $C$ represents the same value as in the preceding formula, viz., the constant value of the maximum coefficient of probability, and $F''$ is a quadratic function of the differences $p_1'' - P_1''$, $q_1'' - Q_1''$, etc., the phase ($P_1''$, $Q_1''$, etc.) being that which at the time $t''$ corresponds to the phase ($P_1'$, $Q_1'$, etc.) at the time $t'$.

Now we have necessarily
$$\int\cdots\int C\,e^{-F'}\,dp_1' \ldots dq_n' = \int\cdots\int C\,e^{-F''}\,dp_1'' \ldots dq_n'' = 1, \tag{53}$$
when the integration is extended over all possible phases. It will be allowable to set $\pm\infty$ for the limits of all the coordinates and momenta, not because these values represent the actual limits of possible phases, but because the portions of the integrals lying outside of the limits of all possible phases will have sensibly the value zero. With $\pm\infty$ for limits, the equation gives
$$\frac{C\,\pi^n}{\sqrt{f'}} = \frac{C\,\pi^n}{\sqrt{f''}}, \tag{54}$$
where $f'$ is the discriminant* of $F'$, and $f''$ that of $F''$. This discriminant is therefore constant in time, and like $C$ an absolute invariant in respect to the system of coordinates which may be employed. In dimensions, like $C^2$, it is the reciprocal of the $2n$th power of the product of energy and time.

Let us see precisely how the functions $F'$ and $F''$ are related. The principle of the conservation of the probability-coefficient requires that any values of the coordinates and momenta at the time $t'$ shall give the function $F'$ the same value as the corresponding coordinates and momenta at the time $t''$ give to $F''$. Therefore $F''$ may be derived from $F'$ by substituting for $p_1', \ldots q_n'$ their values in terms of $p_1'', \ldots q_n''$.
Now we have approximately
$$\begin{aligned}
p_1' - P_1' &= \frac{dp_1'}{dp_1''}\,(p_1'' - P_1'') + \ldots + \frac{dp_1'}{dq_n''}\,(q_n'' - Q_n''),\\
&\;\;\vdots\\
q_n' - Q_n' &= \frac{dq_n'}{dp_1''}\,(p_1'' - P_1'') + \ldots + \frac{dq_n'}{dq_n''}\,(q_n'' - Q_n''),
\end{aligned} \tag{55}$$
and as in $F''$ terms of higher degree than the second are to be neglected, these equations may be considered accurate for the purpose of the transformation required. Since by equation (33) the eliminant of these equations has the value unity, the discriminant of $F''$ will be equal to that of $F'$, as has already appeared from the consideration of the principle of conservation of probability of phase, which is, in fact, essentially the same as that expressed by equation (33).

* This term is used to denote the determinant having for elements on the principal diagonal the coefficients of the squares in the quadratic function $F'$, and for its other elements the halves of the coefficients of the products in $F'$.

At the time $t'$, the phases satisfying the equation
$$F' = k, \tag{56}$$
where $k$ is any positive constant, have the probability-coefficient $C e^{-k}$. At the time $t''$, the corresponding phases satisfy the equation
$$F'' = k, \tag{57}$$
and have the same probability-coefficient. So also the phases within the limits given by one or the other of these equations are corresponding phases, and have probability-coefficients greater than $C e^{-k}$, while phases without these limits have smaller probability-coefficients. The probability that the phase at the time $t'$ falls within the limits $F' = k$ is the same as the probability that it falls within the limits $F'' = k$ at the time $t''$, since either event necessitates the other. This probability may be evaluated as follows. We may omit the accents, as we need only consider a single time. Let us denote the extension-in-phase within the limits $F = k$ by $U$, and the probability that the phase falls within these limits by $R$, also the extension-in-phase within the limits $F = 1$ by $U_1$. We have then by definition
$$U = \int\cdots\int^{F=k} dp_1 \ldots dq_n, \tag{58}$$
$$R = \int\cdots\int^{F=k} C\,e^{-F}\,dp_1 \ldots dq_n, \tag{59}$$
$$U_1 = \int\cdots\int^{F=1} dp_1 \ldots dq_n.$$
(60) But since F is a homogeneous quadratic function of the differences Rl Fly F^y •••!?» Qnl we have identically F—k J. . Jd(Pl - Px) . .. d(qn - Qn) kF—k ==^^di^p-L — Px) . .. d(g» — Qn) F—l = Ain^d(/>i —* Fi) . • • d(([n (?»)• That is U-kn TTU (61) whence dIJ — TJxn P*-1 dk. (62) But if k varies, equations (58) and (59) give F= k+dk dU =J*. . .J*dpx . . . dqn F—k (63) F— k-\-dk dR =J\. .J* C e~F dp± . .. dqn (64) F=k Since the factor <7 (78) d(ra, ... r2n) d(r3, ...r2n) which may be integrated by quadratures and gives V as functions of r19 r2, A, and thus as function of r19 . . . r2n. This integration gives us the last of the arbitrary constants which are functions of the coordinates and momenta without the time. The final integration, which introduces the remain-AND THEORY OF INTEGRATION. 29 ing constant (a), is also a quadrature, since the equation to be integrated may be expressed in the form dt = F (ri) dr±. Now, apart from any such considerations as have been adduced, if we limit ourselves to the changes which take place in time, we have identically r2 drx — rx dr2 = 0, and rt and r2 are given in terms of rv ... r2n by the differential equations of motion. When we have obtained 2 n — 2 integral equations, we may regard r2 and as known functions of rt and r2. The only remaining difficulty is in integrating this equation. If the case is so simple as to present no difficulty, or if we have the skill or the good fortune to perceive that the multiplier i d(c, ... h) ’ (79) d(Y8, ••• ^*2») or any other, will make the first member of the equation an exact differential, we have no need of the rather lengthy considerations which have been adduced. The utility of the principle of conservation of extension-in-phase is that it supplies a 4 multiplier ’ which renders the equation integrable, and which it might be difficult or impossible to find otherwise. It will be observed that the function represented by V is a particular case of that represented by b. 
The system of arbitrary constants $a, b', c, \ldots h$ has certain properties notable for simplicity. If we write $b'$ for $b$ in (77), and compare the result with (78), we get
$$\frac{d(r_1, \ldots r_{2n})}{d(a, b', c, \ldots h)} = 1. \tag{80}$$
Therefore the multiple integral
$$\int\cdots\int da\,db'\,dc \ldots dh \tag{81}$$
taken within limits formed by phases regarded as contemporaneous represents the extension-in-phase within those limits.

The case is somewhat different when the forces are not determined by the coordinates alone, but are functions of the coordinates with the time. All the arbitrary constants of the integral equations must then be regarded in the general case as functions of $r_1, \ldots r_{2n}$, and $t$. We cannot use the principle of conservation of extension-in-phase until we have made $2n - 1$ integrations. Let us suppose that the constants $b, \ldots h$ have been determined by integration in terms of $r_1, \ldots r_{2n}$, and $t$, leaving a single constant ($a$) to be thus determined. Our $2n - 1$ finite equations enable us to regard all the variables $r_1, \ldots r_{2n}$ as functions of a single one, say $r_1$.

For constant values of $b, \ldots h$, we have
$$dr_1 = \frac{dr_1}{da}\,da + \dot r_1\,dt. \tag{82}$$
Now
$$\begin{aligned}
\int\cdots\int \frac{dr_1}{da}\,da\,dr_2 \ldots dr_{2n} &= \int\cdots\int dr_1 \ldots dr_{2n}\\
&= \int\cdots\int \frac{d(r_1, \ldots r_{2n})}{d(a, \ldots h)}\,da \ldots dh\\
&= \int\cdots\int \frac{d(r_1, \ldots r_{2n})}{d(a, \ldots h)}\,\frac{d(b, \ldots h)}{d(r_2, \ldots r_{2n})}\,da\,dr_2 \ldots dr_{2n},
\end{aligned}$$
where the limits of the integrals are formed by the same phases. We have therefore
$$\frac{dr_1}{da} = \frac{d(r_1, \ldots r_{2n})}{d(a, \ldots h)}\,\frac{d(b, \ldots h)}{d(r_2, \ldots r_{2n})}, \tag{83}$$
by which equation (82) may be reduced to the form
$$\frac{d(r_1, \ldots r_{2n})}{d(a, \ldots h)}\,da = \frac{1}{\dfrac{d(b, \ldots h)}{d(r_2, \ldots r_{2n})}}\,dr_1 - \frac{\dot r_1}{\dfrac{d(b, \ldots h)}{d(r_2, \ldots r_{2n})}}\,dt. \tag{84}$$
Now we know by (71) that the coefficient of $da$ is a function of $a, \ldots h$. Therefore, as $b, \ldots h$ are regarded as constant in the equation, the first member represents the differential of a function of $a, \ldots h$, which we may denote by $a'$. We have then
$$da' = \frac{1}{\dfrac{d(b, \ldots h)}{d(r_2, \ldots r_{2n})}}\,dr_1 - \frac{\dot r_1}{\dfrac{d(b, \ldots h)}{d(r_2, \ldots r_{2n})}}\,dt, \tag{85}$$
which may be integrated by quadratures. In this case we may say that the principle of conservation of extension-in-phase has supplied the 'multiplier'
$$\frac{1}{\dfrac{d(b, \ldots h)}{d(r_2, \ldots r_{2n})}} \tag{86}$$
for the integration of the equation
$$dr_1 - \dot r_1\,dt = 0. \tag{87}$$
The system of arbitrary constants $a', b, \ldots h$ has evidently the same properties which were noticed in regard to the system $a, b', \ldots h$.

CHAPTER IV.

ON THE DISTRIBUTION IN PHASE CALLED CANONICAL, IN WHICH THE INDEX OF PROBABILITY IS A LINEAR FUNCTION OF THE ENERGY.

Let us now give our attention to the statistical equilibrium of ensembles of conservative systems, especially to those cases and properties which promise to throw light on the phenomena of thermodynamics.

The condition of statistical equilibrium may be expressed in the form*
$$\Sigma\left(\frac{dP}{dp_1}\,\dot p_1 + \frac{dP}{dq_1}\,\dot q_1\right) = 0, \tag{88}$$
where $P$ is the coefficient of probability, or the quotient of the density-in-phase by the whole number of systems. To satisfy this condition, it is necessary and sufficient that $P$ should be a function of the $p$'s and $q$'s (the momenta and coordinates) which does not vary with the time in a moving system. In all cases which we are now considering, the energy, or any function of the energy, is such a function.
$$P = \text{func.}\,(\epsilon)$$
will therefore satisfy the equation, as indeed appears identically if we write it in the form
$$\Sigma\left(\frac{dP}{dq_1}\,\frac{d\epsilon}{dp_1} - \frac{dP}{dp_1}\,\frac{d\epsilon}{dq_1}\right) = 0.$$

* See equations (20), (41), (42), also the paragraph following equation (20). The positions of any external bodies which can affect the systems are here supposed uniform for all the systems and constant in time.

There are, however, other conditions to which $P$ is subject, which are not so much conditions of statistical equilibrium, as conditions implicitly involved in the definition of the coefficient of probability, whether the case is one of equilibrium or not.
These are: that $P$ should be single-valued, and neither negative nor imaginary for any phase, and that expressed by equation (46), viz.,
$$\int_{\text{phases}}^{\text{all}}\!\!\cdots\!\int P\,dp_1 \ldots dq_n = 1. \tag{89}$$
These considerations exclude
$$P = \epsilon \times \text{constant},$$
as well as
$$P = \text{constant},$$
as cases to be considered.

The distribution represented by
$$\eta = \log P = \frac{\psi - \epsilon}{\Theta}, \tag{90}$$
or
$$P = e^{\frac{\psi - \epsilon}{\Theta}}, \tag{91}$$
where $\Theta$ and $\psi$ are constants, and $\Theta$ positive, seems to represent the most simple case conceivable, since it has the property that when the system consists of parts with separate energies, the laws of the distribution in phase of the separate parts are of the same nature, a property which enormously simplifies the discussion, and is the foundation of extremely important relations to thermodynamics. The case is not rendered less simple by the divisor $\Theta$ (a quantity of the same dimensions as $\epsilon$), but the reverse, since it makes the distribution independent of the units employed. The negative sign of $\epsilon$ is required by (89), which determines also the value of $\psi$ for any given $\Theta$, viz.,
$$e^{-\frac{\psi}{\Theta}} = \int_{\text{phases}}^{\text{all}}\!\!\cdots\!\int e^{-\frac{\epsilon}{\Theta}}\,dp_1 \ldots dq_n. \tag{92}$$
When an ensemble of systems is distributed in phase in the manner described, i. e., when the index of probability is a linear function of the energy, we shall say that the ensemble is canonically distributed, and shall call the divisor of the energy ($\Theta$) the modulus of distribution.

The fractional part of an ensemble canonically distributed which lies within any given limits of phase is therefore represented by the multiple integral
$$\int\cdots\int e^{\frac{\psi - \epsilon}{\Theta}}\,dp_1 \ldots dq_n \tag{93}$$
taken within those limits. We may express the same thing by saying that the multiple integral expresses the probability that an unspecified system of the ensemble (i. e., one of which we only know that it belongs to the ensemble) falls within the given limits.
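The determination of $\psi$ by (92) may be illustrated numerically. The sketch below is offered only as an example under assumptions that are not in the text: a single degree of freedom with energy $\epsilon = p^2/2 + q^2/2$ (an oscillator of unit mass and unit frequency), the modulus set to the arbitrary value 0.7, and the infinite limits replaced by a square large enough that the neglected part is insensible. The function names are hypothetical. For this particular energy the integral in (92) has the known value $2\pi\Theta$, which serves as a check.

```python
import math

THETA = 0.7  # modulus of distribution (illustrative value)

def energy(p, q):
    """Energy of the assumed unit-mass, unit-frequency oscillator."""
    return 0.5 * p * p + 0.5 * q * q

def phase_integral(f, half_width=8.0, n=400):
    """Midpoint-rule estimate of the double integral of f(p, q) over a square."""
    h = 2.0 * half_width / n
    total = 0.0
    for i in range(n):
        p = -half_width + (i + 0.5) * h
        for j in range(n):
            q = -half_width + (j + 0.5) * h
            total += f(p, q)
    return total * h * h

# Equation (92): exp(-psi/THETA) is the integral of exp(-energy/THETA).
z = phase_integral(lambda p, q: math.exp(-energy(p, q) / THETA))
psi = -THETA * math.log(z)

# Equation (89): the coefficient of probability must integrate to unity.
norm = phase_integral(lambda p, q: math.exp((psi - energy(p, q)) / THETA))

# Closed form for this particular energy, used only for comparison.
psi_exact = -THETA * math.log(2.0 * math.pi * THETA)
print(psi, psi_exact, norm)
```

The numerical value of `psi` depends on the units in which $p$ and $q$ are measured, since the integral is an extension-in-phase weighted by a pure number; only the difference $\psi - \epsilon$ is free from the choice of the zero of energy.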
Since the value of a multiple integral of the form (23) (which we have called an extension-in-phase) bounded by any given phases is independent of the system of coordinates by which it is evaluated, the same must be true of the multiple integral in (92), as appears at once if we divide up this integral into parts so small that the exponential factor may be regarded as constant in each. The value of $\psi$ is therefore independent of the system of coordinates employed.

It is evident that $\psi$ might be defined as the energy for which the coefficient of probability of phase has the value unity. Since however this coefficient has the dimensions of the inverse $n$th power of the product of energy and time,* the energy represented by $\psi$ is not independent of the units of energy and time. But when these units have been chosen, the definition of $\psi$ will involve the same arbitrary constant as $\epsilon$, so that, while in any given case the numerical values of $\psi$ or $\epsilon$ will be entirely indefinite until the zero of energy has also been fixed for the system considered, the difference $\psi - \epsilon$ will represent a perfectly definite amount of energy, which is entirely independent of the zero of energy which we may choose to adopt.

* See Chapter I, p. 19.

It is evident that the canonical distribution is entirely determined by the modulus (considered as a quantity of energy) and the nature of the system considered, since when equation (92) is satisfied the value of the multiple integral (93) is independent of the units and of the coordinates employed, and of the zero chosen for the energy of the system.

In treating of the canonical distribution, we shall always suppose the multiple integral in equation (92) to have a finite value, as otherwise the coefficient of probability vanishes, and the law of distribution becomes illusory.
This will exclude certain cases, but not such, apparently, as will affect the value of our results with respect to their bearing on thermodynamics. It will exclude, for instance, cases in which the system or parts of it can be distributed in unlimited space (or in a space which has limits, but is still infinite in volume), while the energy remains beneath a finite limit. It also excludes many cases in which the energy can decrease without limit, as when the system contains material points which attract one another inversely as the squares of their distances. Cases of material points attracting each other inversely as the distances would be excluded for some values of $\Theta$, and not for others. The investigation of such points is best left to the particular cases. For the purposes of a general discussion, it is sufficient to call attention to the assumption implicitly involved in the formula (92).*

* It will be observed that similar limitations exist in thermodynamics. In order that a mass of gas can be in thermodynamic equilibrium, it is necessary that it be enclosed. There is no thermodynamic equilibrium of a (finite) mass of gas in an infinite space. Again, that two attracting particles should be able to do an infinite amount of work in passing from one configuration (which is regarded as possible) to another, is a notion which, although perfectly intelligible in a mathematical formula, is quite foreign to our ordinary conceptions of matter.

The modulus $\Theta$ has properties analogous to those of temperature in thermodynamics. Let the system $A$ be defined as one of an ensemble of systems of $m$ degrees of freedom distributed in phase with a probability-coefficient
$$e^{\frac{\psi_A - \epsilon_A}{\Theta}},$$
and the system $B$ as one of an ensemble of systems of $n$ degrees of freedom distributed in phase with a probability-coefficient
$$e^{\frac{\psi_B - \epsilon_B}{\Theta}},$$
which has the same modulus. Let $q_1, \ldots q_m$, $p_1, \ldots p_m$ be the coordinates and momenta of $A$, and $q_{m+1}, \ldots q_{m+n}$, $p_{m+1}, \ldots p_{m+n}$ those of $B$.
Now we may regard the systems $A$ and $B$ as together forming a system $C$, having $m + n$ degrees of freedom, and the coordinates and momenta $q_1, \ldots q_{m+n}$, $p_1, \ldots p_{m+n}$. The probability that the phase of the system $C$, as thus defined, will fall within the limits
$$dp_1, \ldots dp_{m+n},\; dq_1, \ldots dq_{m+n}$$
is evidently the product of the probabilities that the systems $A$ and $B$ will each fall within the specified limits, viz.,
$$e^{\frac{\psi_A + \psi_B - \epsilon_A - \epsilon_B}{\Theta}}\,dp_1 \ldots dp_{m+n}\,dq_1 \ldots dq_{m+n}. \tag{94}$$
We may therefore regard $C$ as an undetermined system of an ensemble distributed with the probability-coefficient
$$e^{\frac{\psi_A + \psi_B - (\epsilon_A + \epsilon_B)}{\Theta}}, \tag{95}$$
an ensemble which might be defined as formed by combining each system of the first ensemble with each of the second. But since $\epsilon_A + \epsilon_B$ is the energy of the whole system, and $\psi_A$ and $\psi_B$ are constants, the probability-coefficient is of the general form which we are considering, and the ensemble to which it relates is in statistical equilibrium and is canonically distributed.

This result, however, so far as statistical equilibrium is concerned, is rather nugatory, since conceiving of separate systems as forming a single system does not create any interaction between them, and if the systems combined belong to ensembles in statistical equilibrium, to say that the ensemble formed by such combinations as we have supposed is in statistical equilibrium, is only to repeat the data in different words. Let us therefore suppose that in forming the system $C$ we add certain forces acting between $A$ and $B$, and having the force-function $-\epsilon_{AB}$. The energy of the system $C$ is now $\epsilon_A + \epsilon_B + \epsilon_{AB}$, and an ensemble of such systems distributed with a density proportional to
$$e^{-\frac{\epsilon_A + \epsilon_B + \epsilon_{AB}}{\Theta}} \tag{96}$$
would be in statistical equilibrium.
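The part played by the common modulus in (95) can be seen in a small numerical illustration. This sketch is an addition for the reader and rests on assumptions: the energies are arbitrary numbers chosen for the example, both constants $\psi$ are set to zero (so the coefficients are not normalized), and the function names are hypothetical. It compares two phases of the compound system which have the same total energy, differently shared between the parts.

```python
import math

def coefficient(psi, eps, theta):
    """Probability-coefficient exp((psi - eps)/theta) of one part."""
    return math.exp((psi - eps) / theta)

def combined(eps_a, eps_b, theta_a, theta_b, psi_a=0.0, psi_b=0.0):
    """Product of the coefficients of A and B for one phase of the compound system."""
    return coefficient(psi_a, eps_a, theta_a) * coefficient(psi_b, eps_b, theta_b)

# Two phases with the same total energy 3.0, shared differently between A and B.
same_1 = combined(1.0, 2.0, 0.5, 0.5)   # equal moduli
same_2 = combined(2.5, 0.5, 0.5, 0.5)
diff_1 = combined(1.0, 2.0, 0.5, 0.8)   # unequal moduli
diff_2 = combined(2.5, 0.5, 0.5, 0.8)

print(same_1, same_2)   # equal: the product depends on the total energy alone
print(diff_1, diff_2)   # unequal: the product is not a function of the total energy
```

With equal moduli the product is a function of the total energy only, and so has the canonical form for the compound system; with unequal moduli it depends on how the energy is divided between the parts.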
Comparing this with the probability-coefficient of $C$ given above (95), we see that if we suppose $\epsilon_{AB}$ (or rather the variable part of this term when we consider all possible configurations of the systems $A$ and $B$) to be infinitely small, the actual distribution in phase of $C$ will differ infinitely little from one of statistical equilibrium, which is equivalent to saying that its distribution in phase will vary infinitely little even in a time indefinitely prolonged.* The case would be entirely different if $A$ and $B$ belonged to ensembles having different moduli, say $\Theta_A$ and $\Theta_B$. The probability-coefficient of $C$ would then be
$$e^{\frac{\psi_A - \epsilon_A}{\Theta_A} + \frac{\psi_B - \epsilon_B}{\Theta_B}}, \tag{97}$$
which is not approximately proportional to any expression of the form (96).

* It will be observed that the above condition relating to the forces which act between the different systems is entirely analogous to that which must hold in the corresponding case in thermodynamics. The most simple test of the equality of temperature of two bodies is that they remain in equilibrium when brought into thermal contact. Direct thermal contact implies molecular forces acting between the bodies. Now the test will fail unless the energy of these forces can be neglected in comparison with the other energies of the bodies. Thus, in the case of energetic chemical action between the bodies, or when the number of particles affected by the forces acting between the bodies is not negligible in comparison with the whole number of particles (as when the bodies have the form of exceedingly thin sheets), the contact of bodies of the same temperature may produce considerable thermal disturbance, and thus fail to afford a reliable criterion of the equality of temperature.

Before proceeding farther in the investigation of the distribution in phase which we have called canonical, it will be interesting to see whether the properties with respect to statistical equilibrium which have been described are peculiar to it, or whether other distributions may have analogous properties.

Let $\eta'$ and $\eta''$ be the indices of probability in two independent ensembles which are each in statistical equilibrium, then $\eta' + \eta''$ will be the index in the ensemble obtained by combining each system of the first ensemble with each system of the second. This third ensemble will of course be in statistical equilibrium, and the function of phase $\eta' + \eta''$ will be a constant of motion. Now when infinitesimal forces are added to the compound systems, if $\eta' + \eta''$ or a function differing infinitesimally from this is still a constant of motion, it must be on account of the nature of the forces added, or if their action is not entirely specified, on account of conditions to which they are subject. Thus, in the case already considered, $\eta' + \eta''$ is a function of the energy of the compound system, and the infinitesimal forces added are subject to the law of conservation of energy.

Another natural supposition in regard to the added forces is that they should be such as not to affect the moments of momentum of the compound system. To get a case in which moments of momentum of the compound system shall be constants of motion, we may imagine material particles contained in two concentric spherical shells, being prevented from passing the surfaces bounding the shells by repulsions acting always in lines passing through the common centre of the shells.
Then, if there are no forces acting between particles in different shells, the mass of particles in each shell will have, besides its energy, the moments of momentum about three axes through the centre as constants of motion.

Now let us imagine an ensemble formed by distributing in phase the system of particles in one shell according to the index of probability

$$A - \frac{\epsilon}{\Theta} + \frac{\omega_1}{\Omega_1} + \frac{\omega_2}{\Omega_2} + \frac{\omega_3}{\Omega_3}, \tag{98}$$

where $\epsilon$ denotes the energy of the system, and $\omega_1$, $\omega_2$, $\omega_3$ its three moments of momentum, and the other letters constants. In like manner let us imagine a second ensemble formed by distributing in phase the system of particles in the other shell according to the index

$$A' - \frac{\epsilon'}{\Theta} + \frac{\omega_1'}{\Omega_1} + \frac{\omega_2'}{\Omega_2} + \frac{\omega_3'}{\Omega_3}, \tag{99}$$

where the letters have similar significations, and $\Theta$, $\Omega_1$, $\Omega_2$, $\Omega_3$ the same values as in the preceding formula. Each of the two ensembles will evidently be in statistical equilibrium, and therefore also the ensemble of compound systems obtained by combining each system of the first ensemble with each of the second. In this third ensemble the index of probability will be

$$A + A' - \frac{\epsilon + \epsilon'}{\Theta} + \frac{\omega_1 + \omega_1'}{\Omega_1} + \frac{\omega_2 + \omega_2'}{\Omega_2} + \frac{\omega_3 + \omega_3'}{\Omega_3}, \tag{100}$$

where the four numerators represent functions of phase which are constants of motion for the compound systems.

Now if we add in each system of this third ensemble infinitesimal conservative forces of attraction or repulsion between particles in different shells, determined by the same law for all the systems, the functions $\omega_1 + \omega_1'$, $\omega_2 +$ …

… (139)

These determinants are all functions of the $q$'s alone.* The last is evidently the Hessian or determinant formed of the second differential coefficients of the kinetic energy with respect to $\dot q_1, \ldots \dot q_n$. We shall denote it by $\Delta_{\dot q}$. The reciprocal determinant

$$\frac{d(\dot q_1, \ldots \dot q_n)}{d(p_1, \ldots p_n)},$$

which is the Hessian of the kinetic energy regarded as function of the $p$'s, we shall denote by $\Delta_p$.

* It will be observed that the proof of (137) depends on the linear relation between the $u$'s and $\dot q$'s, which makes $\frac{du_r}{d\dot q_x}$ constant with respect to the differentiations here considered. Compare note on p. 12.

If we set

$$e^{-\frac{\psi_p}{\Theta}} = \int_{-\infty}^{+\infty}\!\!\cdots\!\int_{-\infty}^{+\infty} e^{-\frac{\epsilon_p}{\Theta}} \Delta_p^{1/2}\, dp_1 \ldots dp_n = \int_{-\infty}^{+\infty}\!\!\cdots\!\int_{-\infty}^{+\infty} e^{-\frac{u_1^2 + \cdots + u_n^2}{2\Theta}}\, du_1 \ldots du_n = (2\pi\Theta)^{n/2} \tag{140}$$

and

$$\psi_q = \psi - \psi_p, \tag{141}$$

the fractional part of the ensemble which lies within any given limits of configuration (136) may be written

$$\int\!\cdots\!\int e^{\frac{\psi_q - \epsilon_q}{\Theta}} \Delta_{\dot q}^{1/2}\, dq_1 \ldots dq_n, \tag{142}$$

where the constant $\psi_q$ may be determined by the condition that the integral extended over all configurations has the value unity.*

* In the simple but important case in which $\Delta_{\dot q}$ is independent of the $q$'s, and $\epsilon_q$ a quadratic function of the $q$'s, if we write $\epsilon_a$ for the least value of $\epsilon_q$ (or of $\epsilon$) consistent with the given values of the external coordinates, the equation determining $\psi_q$ may be written

$$e^{\frac{\epsilon_a - \psi_q}{\Theta}} = \Delta_{\dot q}^{1/2} \int_{-\infty}^{+\infty}\!\!\cdots\!\int_{-\infty}^{+\infty} e^{-\frac{\epsilon_q - \epsilon_a}{\Theta}}\, dq_1 \ldots dq_n.$$

If we denote by $q_1', \ldots q_n'$ the values of $q_1, \ldots q_n$ which give $\epsilon_q$ its least value $\epsilon_a$, it is evident that $\epsilon_q - \epsilon_a$ is a homogeneous quadratic function of the differences $q_1 - q_1'$, etc., and that $dq_1, \ldots dq_n$ may be regarded as the differentials of these differences. The evaluation of this integral is therefore analytically similar to that of the integral

$$\int_{-\infty}^{+\infty}\!\!\cdots\!\int_{-\infty}^{+\infty} e^{-\frac{\epsilon_p}{\Theta}}\, dp_1 \ldots dp_n,$$

for which we have found the value $\Delta_p^{-1/2} (2\pi\Theta)^{n/2}$. By the same method, or by analogy, we get

$$e^{\frac{\epsilon_a - \psi_q}{\Theta}} = \left(\frac{\Delta_{\dot q}}{\Delta_q}\right)^{1/2} (2\pi\Theta)^{n/2},$$

where $\Delta_q$ is the Hessian of the potential energy as function of the …

… $p_n$ and $Q_1, \ldots Q_n$, $P_1, \ldots P_n$ are two systems of coordinates and momenta.* It follows that … (149) or … (150) and

$$\int\!\cdots\!\int \left(\frac{d(\dot Q_1, \ldots \dot Q_n)}{d(P_1, \ldots P_n)}\right)^{1/2} dP_1 \ldots dP_n = \int\!\cdots\!\int \left(\frac{d(\dot Q_1, \ldots \dot Q_n)}{d(P_1, \ldots P_n)}\right)^{1/2} \frac{d(P_1, \ldots P_n)}{d(p_1, \ldots p_n)}\, dp_1 \ldots dp_n$$

$$= \int\!\cdots\!\int \left(\frac{d(\dot Q_1, \ldots \dot Q_n)}{d(P_1, \ldots P_n)}\right)^{1/2} \left(\frac{d(P_1, \ldots P_n)}{d(p_1, \ldots p_n)}\right)^{1/2} \left(\frac{d(\dot q_1, \ldots \dot q_n)}{d(\dot Q_1, \ldots \dot Q_n)}\right)^{1/2} dp_1 \ldots dp_n = \int\!\cdots\!\int \left(\frac{d(\dot q_1, \ldots \dot q_n)}{d(p_1, \ldots p_n)}\right)^{1/2} dp_1 \ldots dp_n.$$

* See equation (29).

The multiple integral

$$\int\!\cdots\!\int dp_1 \ldots dp_n\, dq_1 \ldots dq_n, \tag{151}$$

which may also be written

$$\int\!\cdots\!\int \Delta_{\dot q}\, d\dot q_1 \ldots d\dot q_n\, dq_1 \ldots dq_n, \tag{152}$$

and which, when taken within any given limits of phase, has been shown to have a value independent of the coordinates employed, expresses what we have called an extension-in-phase.* In like manner we may say that the multiple integral (148) expresses an extension-in-configuration, and that the multiple integrals (149) and (150) express an extension-in-velocity. We have called

$$dp_1 \ldots dp_n\, dq_1 \ldots dq_n, \tag{153}$$

which is equivalent to

$$\Delta_{\dot q}\, d\dot q_1 \ldots d\dot q_n\, dq_1 \ldots dq_n, \tag{154}$$

an element of extension-in-phase. We may call

$$\Delta_{\dot q}^{1/2}\, dq_1 \ldots dq_n \tag{155}$$

an element of extension-in-configuration, and

$$\Delta_p^{1/2}\, dp_1 \ldots dp_n, \tag{156}$$

or its equivalent

$$\Delta_{\dot q}^{1/2}\, d\dot q_1 \ldots d\dot q_n, \tag{157}$$

an element of extension-in-velocity.

* See Chapter I, p. 10.

An extension-in-phase may always be regarded as an integral of elementary extensions-in-configuration multiplied each by an extension-in-velocity. This is evident from the formulae (151) and (152) which express an extension-in-phase, if we imagine the integrations relative to velocity to be first carried out.

The product of the two expressions for an element of extension-in-velocity (149) and (150) is evidently of the same dimensions as the product

$$p_1 \ldots p_n\, \dot q_1 \ldots \dot q_n,$$

that is, as the $n$th power of energy, since every product of the form $p_1 \dot q_1$ has the dimensions of energy. Therefore an extension-in-velocity has the dimensions of the square root of the $n$th power of energy. Again we see by (155) and (156) that the product of an extension-in-configuration and an extension-in-velocity has the dimensions of the $n$th power of energy multiplied by the $n$th power of time. Therefore an extension-in-configuration has the dimensions of the $n$th power of time multiplied by the square root of the $n$th power of energy.

To the notion of extension-in-configuration there attach themselves certain other notions analogous to those which have presented themselves in connection with the notion of extension-in-phase.
The number of systems of any ensemble (whether distributed canonically or in any other manner) which are contained in an element of extension-in-configuration, divided by the numerical value of that element, may be called the density-in-configuration. That is, if a certain configuration is specified by the coordinates $q_1, \ldots q_n$, and the number of systems of which the coordinates fall between the limits $q_1$ and $q_1 + dq_1$, ... $q_n$ and $q_n + dq_n$ is expressed by

$$D_q\, \Delta_{\dot q}^{1/2}\, dq_1 \ldots dq_n, \tag{158}$$

$D_q$ will be the density-in-configuration. And if we set

$$e^{\eta_q} = \frac{D_q}{N}, \tag{159}$$

where $N$ denotes, as usual, the total number of systems in the ensemble, the probability that an unspecified system of the ensemble will fall within the given limits of configuration, is expressed by

$$e^{\eta_q}\, \Delta_{\dot q}^{1/2}\, dq_1 \ldots dq_n. \tag{160}$$

We may call $e^{\eta_q}$ the coefficient of probability of the configuration, and $\eta_q$ the index of probability of the configuration.

The fractional part of the whole number of systems which are within any given limits of configuration will be expressed by the multiple integral

$$\int\!\cdots\!\int e^{\eta_q}\, \Delta_{\dot q}^{1/2}\, dq_1 \ldots dq_n. \tag{161}$$

The value of this integral (taken within any given configurations) is therefore independent of the system of coordinates which is used. Since the same has been proved of the same integral without the factor $e^{\eta_q}$, it follows that the values of $\eta_q$ and $D_q$ for a given configuration in a given ensemble are independent of the system of coordinates which is used.
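The independence of the coordinate system can be illustrated numerically. The following is only a sketch under assumptions that are not in the text: a single particle of mass `m` moving in a plane, confined to a quarter annulus with radii 1 and 2. In Cartesian coordinates the Hessian of the kinetic energy with respect to the velocities is diag(m, m), so the element of extension-in-configuration is m dx dy; in polar coordinates it is diag(m, m r^2), giving m r dr dθ. The function names and the region are hypothetical choices made for the illustration.

```python
import math

def midpoint(f, a, b, n=20000):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

m = 2.0  # assumed particle mass, arbitrary units

def cartesian_extension():
    """Integrate sqrt(det diag(m, m)) dx dy = m dx dy over the quarter annulus.

    For each x the region runs in y from the inner circle to the outer circle.
    """
    def height(x):
        return math.sqrt(4.0 - x * x) - math.sqrt(max(0.0, 1.0 - x * x))
    return m * midpoint(height, 0.0, 2.0)

def polar_extension():
    """Integrate sqrt(det diag(m, m r^2)) dr dtheta = m r dr dtheta.

    The theta integral over [0, pi/2] is done in closed form.
    """
    return midpoint(lambda r: m * r * (math.pi / 2.0), 1.0, 2.0)

if __name__ == "__main__":
    print(cartesian_extension(), polar_extension())
```

Both functions approximate the same number, m times the area of the region, although the integrands and the limits look quite different in the two coordinate systems.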
The notion of extension-in-velocity relates to systems having the same configuration.* If an ensemble is distributed both in configuration and in velocity, we may confine our attention to those systems which are contained within certain infinitesimal limits of configuration, and compare the whole number of such systems with those which are also contained within certain infinitesimal limits of velocity. The second of these numbers divided by the first expresses the probability that a system which is only specified as falling within the infinitesimal limits of configuration shall also fall within the infinitesimal limits of velocity. If the limits with respect to velocity are expressed by the condition that the momenta shall fall between the limits $p_1$ and $p_1 + dp_1$, ... $p_n$ and $p_n + dp_n$, the extension-in-velocity within those limits will be

$$\Delta_p^{1/2}\, dp_1 \ldots dp_n,$$

and we may express the probability in question by

$$e^{\eta_p}\, \Delta_p^{1/2}\, dp_1 \ldots dp_n. \tag{162}$$

This may be regarded as defining $\eta_p$.

The probability that a system which is only specified as having a configuration within certain infinitesimal limits shall also fall within any given limits of velocity will be expressed by the multiple integral

$$\int\!\cdots\!\int e^{\eta_p}\, \Delta_p^{1/2}\, dp_1 \ldots dp_n, \tag{163}$$

or its equivalent

$$\int\!\cdots\!\int e^{\eta_p}\, \Delta_{\dot q}^{1/2}\, d\dot q_1 \ldots d\dot q_n, \tag{164}$$

taken within the given limits.

* Except in some simple cases, such as a system of material points, we cannot compare velocities in one configuration with velocities in another, and speak of their identity or difference except in a sense entirely artificial. We may indeed say that we call the velocities in one configuration the same as those in another when the quantities $\dot q_1, \ldots \dot q_n$ have the same values in the two cases. But this signifies nothing until the system of coordinates has been defined. We might identify the velocities in the two cases which make the quantities $p_1, \ldots p_n$ the same in each. This again would signify nothing independently of the system of coordinates employed.
It follows that the probability that the system will fall within the limits of velocity, $\dot q_1$ and $\dot q_1 + d\dot q_1$, ... $\dot q_n$ and $\dot q_n + d\dot q_n$, is expressed by

$$e^{\eta_p}\, \Delta_{\dot q}^{1/2}\, d\dot q_1 \ldots d\dot q_n. \tag{165}$$

The value of the integrals (163), (164) is independent of the system of coordinates and momenta which is used, as is also the value of the same integrals without the factor $e^{\eta_p}$; therefore the value of $\eta_p$ must be independent of the system of coordinates and momenta. We may call $e^{\eta_p}$ the coefficient of probability of velocity, and $\eta_p$ the index of probability of velocity.

Comparing (160) and (162) with (40), we get

$$e^{\eta_q} e^{\eta_p} = P = e^{\eta} \tag{166}$$

or

$$\eta_q + \eta_p = \eta. \tag{167}$$

That is: the product of the coefficients of probability of configuration and of velocity is equal to the coefficient of probability of phase; the sum of the indices of probability of configuration and of velocity is equal to the index of probability of phase.

It is evident that $e^{\eta_q}$ and $e^{\eta_p}$ have the dimensions of the reciprocals of extension-in-configuration and extension-in-velocity respectively, i.e., the dimensions of $t^{-n} \epsilon^{-n/2}$ and $\epsilon^{-n/2}$, where $t$ represents any time, and $\epsilon$ any energy. If, therefore, the unit of time is multiplied by $c_t$, and the unit of energy by $c_\epsilon$, every $\eta_q$ will be increased by the addition of

$$n \log c_t + \tfrac{1}{2} n \log c_\epsilon, \tag{168}$$

and every $\eta_p$ by the addition of

$$\tfrac{1}{2} n \log c_\epsilon.* \tag{169}$$

It should be observed that the quantities which have been called extension-in-configuration and extension-in-velocity are not, as the terms might seem to imply, purely geometrical or kinematical conceptions. To express their nature more fully, they might appropriately have been called, respectively, the dynamical measure of the extension in configuration, and the dynamical measure of the extension in velocity. They depend upon the masses, although not upon the forces of the system.
In the simple case of material points, where each point is limited to a given space, the extension-in-configuration is the product of the volumes within which the several points are confined (these may be the same or different), multiplied by the square root of the cube of the product of the masses of the several points. The extension-in-velocity for such systems is most easily defined as the extension-in-configuration of systems which have moved from the same configuration for the unit of time with the given velocities.

* Compare (47) in Chapter I.

In the general case, the notions of extension-in-configuration and extension-in-velocity may be connected as follows.

If an ensemble of similar systems of $n$ degrees of freedom have the same configuration at a given instant, but are distributed throughout any finite extension-in-velocity, the same ensemble after an infinitesimal interval of time $\delta t$ will be distributed throughout an extension in configuration equal to its original extension-in-velocity multiplied by $\delta t^n$.

In demonstrating this theorem, we shall write $q_1', \ldots q_n'$ for the initial values of the coordinates. The final values will evidently be connected with the initial by the equations

$$q_1 - q_1' = \dot q_1\, \delta t, \quad \ldots \quad q_n - q_n' = \dot q_n\, \delta t. \tag{170}$$

Now the original extension-in-velocity is by definition represented by the integral

$$\int\!\cdots\!\int \Delta_{\dot q}^{1/2}\, d\dot q_1 \ldots d\dot q_n, \tag{171}$$

where the limits may be expressed by an equation of the form

$$F(\dot q_1, \ldots \dot q_n) = 0. \tag{172}$$

The same integral multiplied by the constant $\delta t^n$ may be written

$$\int\!\cdots\!\int \Delta_{\dot q}^{1/2}\, d(\dot q_1 \delta t) \ldots d(\dot q_n \delta t), \tag{173}$$

and the limits may be written

$$F(\dot q_1, \ldots \dot q_n) = f(\dot q_1 \delta t, \ldots \dot q_n \delta t) = 0. \tag{174}$$

(It will be observed that $\delta t$ as well as $\Delta_{\dot q}$ is constant in the integrations.) Now this integral is identically equal to

$$\int\!\cdots\!\int \Delta_{\dot q}^{1/2}\, d(q_1 - q_1') \ldots d(q_n - q_n'), \tag{175}$$

or its equivalent

$$\int\!\cdots\!\int \Delta_{\dot q}^{1/2}\, dq_1 \ldots dq_n, \tag{176}$$

with limits expressed by the equation

$$f(q_1 - q_1', \ldots q_n - q_n') = 0. \tag{177}$$
But the systems which initially had velocities satisfying the equation (172) will after the interval $\delta t$ have configurations satisfying equation (177). Therefore the extension-in-configuration represented by the last integral is that which belongs to the systems which originally had the extension-in-velocity represented by the integral (171).

Since the quantities which we have called extensions-in-phase, extensions-in-configuration, and extensions-in-velocity are independent of the nature of the system of coordinates used in their definitions, it is natural to seek definitions which shall be independent of the use of any coordinates. It will be sufficient to give the following definitions without formal proof of their equivalence with those given above, since they are less convenient for use than those founded on systems of coordinates, and since we shall in fact have no occasion to use them.

We commence with the definition of extension-in-velocity. We may imagine $n$ independent velocities, $V_1, \ldots V_n$, of which a system in a given configuration is capable. We may conceive of the system as having a certain velocity $V_0$ combined with a part of each of these velocities $V_1, \ldots V_n$. By a part of $V_1$ is meant a velocity of the same nature as $V_1$ but in amount being anything between zero and $V_1$. Now all the velocities which may be thus described may be regarded as forming or lying in a certain extension of which we desire a measure. The case is greatly simplified if we suppose that certain relations exist between the velocities $V_1, \ldots V_n$, viz.: that the kinetic energy due to any two of these velocities combined is the sum of the kinetic energies due to the velocities separately. In this case the extension-in-motion is the square root of the product of the doubled kinetic energies due to the $n$ velocities $V_1, \ldots V_n$ taken separately.

The more general case may be reduced to this simpler case as follows.
The velocity $V_2$ may always be regarded as composed of two velocities $V_2'$ and $V_2''$, of which $V_2'$ is of the same nature as $V_1$ (it may be more or less in amount, or opposite in sign), while $V_2''$ satisfies the relation that the kinetic energy due to $V_1$ and $V_2''$ combined is the sum of the kinetic energies due to these velocities taken separately. And the velocity $V_3$ may be regarded as compounded of three, $V_3'$, $V_3''$, $V_3'''$, of which $V_3'$ is of the same nature as $V_1$, $V_3''$ of the same nature as $V_2''$, while $V_3'''$ satisfies the relations that if combined either with $V_1$ or $V_2''$ the kinetic energy of the combined velocities is the sum of the kinetic energies of the velocities taken separately. When all the velocities $V_2, \ldots V_n$ have been thus decomposed, the square root of the product of the doubled kinetic energies of the several velocities $V_1$, $V_2''$, $V_3'''$, etc., will be the value of the extension-in-velocity which is sought.

This method of evaluation of the extension-in-velocity which we are considering is perhaps the most simple and natural, but the result may be expressed in a more symmetrical form. Let us write $\epsilon_{12}$ for the kinetic energy of the velocities $V_1$ and $V_2$ combined, diminished by the sum of the kinetic energies due to the same velocities taken separately. This may be called the mutual energy of the velocities $V_1$ and $V_2$. Let the mutual energy of every pair of the velocities $V_1, \ldots V_n$ be expressed in the same way. Analogy would make $\epsilon_{11}$ represent the energy of twice $V_1$ diminished by twice the energy of $V_1$, i.e., $\epsilon_{11}$ would represent twice the energy of $V_1$, although the term mutual energy is hardly appropriate to this case. At all events, let $\epsilon_{11}$ have this signification, and $\epsilon_{22}$ represent twice the energy of $V_2$, etc. The square root of the determinant

$$\begin{vmatrix} \epsilon_{11} & \epsilon_{12} & \cdots & \epsilon_{1n} \\ \epsilon_{21} & \epsilon_{22} & \cdots & \epsilon_{2n} \\ \vdots & \vdots & & \vdots \\ \epsilon_{n1} & \epsilon_{n2} & \cdots & \epsilon_{nn} \end{vmatrix}$$

represents the value of the extension-in-velocity determined as above described by the velocities $V_1, \ldots V_n$.
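The agreement of the two evaluations just described can be checked numerically. The sketch below assumes, purely for illustration, a system whose kinetic energy is the quadratic form (1/2) v^T M v with a symmetric positive-definite matrix `M`; the matrix `M`, the velocities `V`, and all function names are hypothetical. The first routine forms the determinant of mutual energies; the second carries out the successive decomposition and multiplies the doubled kinetic energies of the resulting components.

```python
import math

def form(M, u, v):
    """Bilinear form u^T M v associated with the kinetic energy."""
    return sum(u[i] * M[i][j] * v[j] for i in range(len(u)) for j in range(len(v)))

def kinetic(M, v):
    """Kinetic energy (1/2) v^T M v of the velocity v."""
    return 0.5 * form(M, v, v)

def mutual_energy(M, u, v):
    """Energy of u and v combined, less the energies of u and v separately."""
    w = [a + b for a, b in zip(u, v)]
    return kinetic(M, w) - kinetic(M, u) - kinetic(M, v)

def determinant(A):
    """Determinant by Gaussian elimination with partial pivoting."""
    A = [row[:] for row in A]
    n, det = len(A), 1.0
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(A[r][c]))
        if A[p][c] == 0.0:
            return 0.0
        if p != c:
            A[c], A[p] = A[p], A[c]
            det = -det
        det *= A[c][c]
        for r in range(c + 1, n):
            f = A[r][c] / A[c][c]
            for k in range(c, n):
                A[r][k] -= f * A[c][k]
    return det

def extension_by_determinant(M, V):
    """Square root of the determinant of mutual energies (diagonal: doubled energies)."""
    n = len(V)
    E = [[2.0 * kinetic(M, V[i]) if i == j else mutual_energy(M, V[i], V[j])
          for j in range(n)] for i in range(n)]
    return math.sqrt(determinant(E))

def extension_by_decomposition(M, V):
    """Strip from each velocity its parts 'of the same nature' as the earlier
    components, then multiply the doubled kinetic energies of what remains."""
    components, product = [], 1.0
    for v in V:
        w = v[:]
        for b in components:
            coeff = form(M, v, b) / form(M, b, b)
            w = [wi - coeff * bi for wi, bi in zip(w, b)]
        components.append(w)
        product *= 2.0 * kinetic(M, w)
    return math.sqrt(product)

# Hypothetical example data.
M = [[2.0, 0.5, 0.0], [0.5, 1.0, 0.2], [0.0, 0.2, 3.0]]
V = [[1.0, 0.0, 0.0], [1.0, 1.0, 0.0], [0.5, -1.0, 2.0]]

if __name__ == "__main__":
    print(extension_by_determinant(M, V), extension_by_decomposition(M, V))
```

For this example both routines return the square root of det(M) multiplied by the absolute value of the determinant of the velocity components, which is the form taken by the coordinate expression for an element of extension-in-velocity when the element is a finite parallelepiped.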
The statements of the preceding paragraph may be readily proved from the expression (157) on page 60, viz.,

$$\Delta_{\dot q}^{1/2}\, d\dot q_1 \ldots d\dot q_n,$$

by which the notion of an element of extension-in-velocity was originally defined. Since $\Delta_{\dot q}$ in this expression represents the determinant of which the general element is

$$\frac{d^2\epsilon}{d\dot q_i\, d\dot q_j},$$

the square of the preceding expression represents the determinant of which the general element is

$$\frac{d^2\epsilon}{d\dot q_i\, d\dot q_j}\, d\dot q_i\, d\dot q_j.$$

Now we may regard the differentials of velocity $d\dot q_i$, $d\dot q_j$ as themselves infinitesimal velocities. Then the last expression represents the mutual energy of these velocities, and

$$\frac{d^2\epsilon}{d\dot q_i^2}\, d\dot q_i^2$$

represents twice the energy due to the velocity $d\dot q_i$.

The case which we have considered is an extension-in-velocity of the simplest form. All extensions-in-velocity do not have this form, but all may be regarded as composed of elementary extensions of this form, in the same manner as all volumes may be regarded as composed of elementary parallelepipeds.

Having thus a measure of extension-in-velocity founded, it will be observed, on the dynamical notion of kinetic energy, and not involving an explicit mention of coordinates, we may derive from it a measure of extension-in-configuration by the principle connecting these quantities which has been given in a preceding paragraph of this chapter.

The measure of extension-in-phase may be obtained from that of extension-in-configuration and of extension-in-velocity. For to every configuration in an extension-in-phase there will belong a certain extension-in-velocity, and the integral of the elements of extension-in-configuration within any extension-in-phase multiplied each by its extension-in-velocity is the measure of the extension-in-phase.

CHAPTER VII.

FARTHER DISCUSSION OF AVERAGES IN A CANONICAL ENSEMBLE OF SYSTEMS.

Returning to the case of a canonical distribution, we have for the index of probability of configuration

$$\eta_q = \frac{\psi_q - \epsilon_q}{\Theta} \tag{178}$$

as appears on comparison of formulae (142) and (161).
It follows immediately from (142) that the average value in the ensemble of any quantity $u$ which depends on the configuration alone is given by the formula

$$\bar u = \int\!\cdots\!\int_{\text{all config.}} u\, e^{\frac{\psi_q - \epsilon_q}{\Theta}}\, \Delta_{\dot q}^{1/2}\, dq_1 \ldots dq_n, \tag{179}$$

where the integrations cover all possible configurations. The value of $\psi_q$ is evidently determined by the equation

$$e^{-\frac{\psi_q}{\Theta}} = \int\!\cdots\!\int_{\text{all config.}} e^{-\frac{\epsilon_q}{\Theta}}\, \Delta_{\dot q}^{1/2}\, dq_1 \ldots dq_n. \tag{180}$$

By differentiating the last equation we may obtain results analogous to those obtained in Chapter IV from the equation

$$e^{-\frac{\psi}{\Theta}} = \int\!\cdots\!\int_{\text{all phases}} e^{-\frac{\epsilon}{\Theta}}\, dp_1 \ldots dq_n.$$

As the process is identical, it is sufficient to give the results:

$$d\psi_q = \bar\eta_q\, d\Theta - \bar A_1\, da_1 - \bar A_2\, da_2 - \text{etc.}, \tag{181}$$

or, since

$$\psi_q = \bar\epsilon_q + \Theta \bar\eta_q, \tag{182}$$

and

$$d\psi_q = d\bar\epsilon_q + \bar\eta_q\, d\Theta + \Theta\, d\bar\eta_q, \tag{183}$$

$$d\bar\epsilon_q = -\Theta\, d\bar\eta_q - \bar A_1\, da_1 - \bar A_2\, da_2 - \text{etc.} \tag{184}$$

It appears from this equation that the differential relations subsisting between the average potential energy in an ensemble of systems canonically distributed, the modulus of distribution, the average index of probability of configuration, taken negatively, and the average forces exerted on external bodies, are equivalent to those enunciated by Clausius for the potential energy of a body, its temperature, a quantity which he called the disgregation, and the forces exerted on external bodies.*

For the index of probability of velocity, in the case of canonical distribution, we have by comparison of (144) and (163), or of (145) and (164),

$$\eta_p = \frac{\psi_p - \epsilon_p}{\Theta} \tag{185}$$

which gives

$$\bar\eta_p = \frac{\psi_p - \bar\epsilon_p}{\Theta}; \tag{186}$$

we have also

$$\bar\epsilon_p = \tfrac{1}{2} n \Theta, \tag{187}$$

and by (140), …

… (214)

i.e., by (108),

$$\overline{\epsilon^h}\, e^{-\frac{\psi}{\Theta}} = \Theta^2 \frac{d}{d\Theta}\left(\overline{\epsilon^{h-1}}\, e^{-\frac{\psi}{\Theta}}\right). \tag{215}$$

Hence

$$\overline{\epsilon^h}\, e^{-\frac{\psi}{\Theta}} = \left(\Theta^2 \frac{d}{d\Theta}\right)^h e^{-\frac{\psi}{\Theta}} \tag{216}$$

and

$$\overline{\epsilon^h} = e^{\frac{\psi}{\Theta}} \left(\Theta^2 \frac{d}{d\Theta}\right)^h e^{-\frac{\psi}{\Theta}}. \tag{217}$$

* This implies that the kinetic and potential energies of individual systems would each separately have values sensibly constant in time.

† As an example, we may take a system consisting of a fluid in a cylinder under a weighted piston, with a vacuum between the piston and the top of the cylinder, which is closed. The weighted piston is to be regarded as a part of the system.
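(This is formally necessary in order to satisfy the condition of the invariability of the external coordinates.) It is evident that at a certain temperature, viz., when the pressure of saturated vapor balances the weight of the piston, there is an indeterminateness in the values of the potential and total energies as functions of the temperature.

For $h = 1$, this gives

$$\bar\epsilon = -\Theta^2 \frac{d}{d\Theta}\left(\frac{\psi}{\Theta}\right), \tag{218}$$

which agrees with (191).

From (216) we have also

$$\overline{\epsilon^h} = \bar\epsilon\, \overline{\epsilon^{h-1}} + \Theta^2 \frac{d\, \overline{\epsilon^{h-1}}}{d\Theta} = \left(\bar\epsilon + \Theta^2 \frac{d}{d\Theta}\right) \overline{\epsilon^{h-1}}, \tag{219}$$

$$\overline{\epsilon^h} = \left(\bar\epsilon + \Theta^2 \frac{d}{d\Theta}\right)^{h-1} \bar\epsilon. \tag{220}$$

In like manner from the identical equation

$$\int\!\cdots\!\int_{\text{all config.}} \epsilon_q^h\, e^{-\frac{\epsilon_q}{\Theta}}\, \Delta_{\dot q}^{1/2}\, dq_1 \ldots dq_n = \Theta^2 \frac{d}{d\Theta} \int\!\cdots\!\int_{\text{all config.}} \epsilon_q^{h-1}\, e^{-\frac{\epsilon_q}{\Theta}}\, \Delta_{\dot q}^{1/2}\, dq_1 \ldots dq_n, \tag{221}$$

we get

$$\overline{\epsilon_q^h} = e^{\frac{\psi_q}{\Theta}} \left(\Theta^2 \frac{d}{d\Theta}\right)^h e^{-\frac{\psi_q}{\Theta}} \tag{222}$$

and

$$\overline{\epsilon_q^h} = \left(\bar\epsilon_q + \Theta^2 \frac{d}{d\Theta}\right)^{h-1} \bar\epsilon_q. \tag{223}$$

With respect to the kinetic energy similar equations will hold for averages taken for any particular configuration, or for the whole ensemble. But since

$$\bar\epsilon_p = \frac{n}{2} \Theta,$$

the equation

$$\overline{\epsilon_p^h} = \left(\bar\epsilon_p + \Theta^2 \frac{d}{d\Theta}\right) \overline{\epsilon_p^{h-1}} \tag{224}$$

reduces to

$$\overline{\epsilon_p^h} = \left(\frac{n}{2} \Theta + \Theta^2 \frac{d}{d\Theta}\right) \overline{\epsilon_p^{h-1}}. \tag{225}$$

We have therefore

$$\overline{\epsilon_p^2} = \left(\frac{n}{2} + 1\right) \frac{n}{2}\, \Theta^2, \tag{226}$$

$$\overline{\epsilon_p^3} = \left(\frac{n}{2} + 2\right) \left(\frac{n}{2} + 1\right) \frac{n}{2}\, \Theta^3, \tag{227}$$

$$\overline{\epsilon_p^h} = \frac{\Gamma\left(\frac{n}{2} + h\right)}{\Gamma\left(\frac{n}{2}\right)}\, \Theta^h.* \tag{228}$$

The average values of the powers of the anomalies of the energies are perhaps most easily found as follows. We have identically, since $\bar\epsilon$ is a function of $\Theta$, while $\epsilon$ is a function of the $p$'s and $q$'s,

$$\Theta^2 \frac{d}{d\Theta} \int\!\cdots\!\int_{\text{all phases}} (\epsilon - \bar\epsilon)^h\, e^{-\frac{\epsilon}{\Theta}}\, dp_1 \ldots dq_n = \int\!\cdots\!\int_{\text{all phases}} \left[\epsilon (\epsilon - \bar\epsilon)^h - h (\epsilon - \bar\epsilon)^{h-1}\, \Theta^2 \frac{d\bar\epsilon}{d\Theta}\right] e^{-\frac{\epsilon}{\Theta}}\, dp_1 \ldots dq_n, \tag{229}$$

i.e., by (108),

$$\Theta^2 \frac{d}{d\Theta} \left[\overline{(\epsilon - \bar\epsilon)^h}\, e^{-\frac{\psi}{\Theta}}\right] = \left[\overline{\epsilon (\epsilon - \bar\epsilon)^h} - h\, \overline{(\epsilon - \bar\epsilon)^{h-1}}\, \Theta^2 \frac{d\bar\epsilon}{d\Theta}\right] e^{-\frac{\psi}{\Theta}}, \tag{230}$$

* In the case discussed in the note on page 54 we may easily get

$$\overline{(\epsilon_q - \epsilon_a)^h} = \left(\bar\epsilon_q - \epsilon_a + \Theta^2 \frac{d}{d\Theta}\right) \overline{(\epsilon_q - \epsilon_a)^{h-1}},$$

which, with

$$\bar\epsilon_q - \epsilon_a = \frac{n}{2} \Theta,$$

gives

$$\overline{(\epsilon_q - \epsilon_a)^h} = \left(\frac{n}{2} \Theta + \Theta^2 \frac{d}{d\Theta}\right) \overline{(\epsilon_q - \epsilon_a)^{h-1}} = \left(\frac{n}{2} \Theta + \Theta^2 \frac{d}{d\Theta}\right)^{h-1} \frac{n}{2} \Theta.$$

Hence

$$\overline{(\epsilon_q - \epsilon_a)^h} = \frac{\Gamma\left(\frac{n}{2} + h\right)}{\Gamma\left(\frac{n}{2}\right)}\, \Theta^h.$$

Again

$$\overline{(\epsilon - \epsilon_a)^h} = \left(\bar\epsilon - \epsilon_a + \Theta^2 \frac{d}{d\Theta}\right) \overline{(\epsilon - \epsilon_a)^{h-1}},$$

which with

$$\bar\epsilon - \epsilon_a = n \Theta$$

gives

$$\overline{(\epsilon - \epsilon_a)^h} = \left(n \Theta + \Theta^2 \frac{d}{d\Theta}\right) \overline{(\epsilon - \epsilon_a)^{h-1}} = \left(n \Theta + \Theta^2 \frac{d}{d\Theta}\right)^{h-1} n \Theta,$$

hence

$$\overline{(\epsilon - \epsilon_a)^h} = \frac{\Gamma(n + h)}{\Gamma(n)}\, \Theta^h.$$

or since by (218)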
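$$\Theta^2 \frac{d}{d\Theta}\, e^{-\frac{\psi}{\Theta}} = \bar\epsilon\, e^{-\frac{\psi}{\Theta}},$$

$$\Theta^2 \frac{d}{d\Theta}\, \overline{(\epsilon - \bar\epsilon)^h} + \bar\epsilon\, \overline{(\epsilon - \bar\epsilon)^h} = \overline{\epsilon (\epsilon - \bar\epsilon)^h} - h\, \overline{(\epsilon - \bar\epsilon)^{h-1}}\, \Theta^2 \frac{d\bar\epsilon}{d\Theta},$$

$$\overline{(\epsilon - \bar\epsilon)^{h+1}} = \Theta^2 \frac{d}{d\Theta}\, \overline{(\epsilon - \bar\epsilon)^h} + h\, \overline{(\epsilon - \bar\epsilon)^{h-1}}\, \Theta^2 \frac{d\bar\epsilon}{d\Theta}. \tag{231}$$

In precisely the same way we may obtain for the potential energy

$$\overline{(\epsilon_q - \bar\epsilon_q)^{h+1}} = \Theta^2 \frac{d}{d\Theta}\, \overline{(\epsilon_q - \bar\epsilon_q)^h} + h\, \overline{(\epsilon_q - \bar\epsilon_q)^{h-1}}\, \Theta^2 \frac{d\bar\epsilon_q}{d\Theta}. \tag{232}$$

By successive applications of (231) we obtain

$$\overline{(\epsilon - \bar\epsilon)^2} = D\bar\epsilon$$

$$\overline{(\epsilon - \bar\epsilon)^3} = D^2\bar\epsilon$$

$$\overline{(\epsilon - \bar\epsilon)^4} = D^3\bar\epsilon + 3 (D\bar\epsilon)^2$$

$$\overline{(\epsilon - \bar\epsilon)^5} = D^4\bar\epsilon + 10\, D\bar\epsilon\, D^2\bar\epsilon$$

$$\overline{(\epsilon - \bar\epsilon)^6} = D^5\bar\epsilon + 15\, D\bar\epsilon\, D^3\bar\epsilon + 10 (D^2\bar\epsilon)^2 + 15 (D\bar\epsilon)^3$$

etc., where $D$ represents the operator $\Theta^2\, d/d\Theta$. Similar expressions relating to the potential energy may be derived from (232).

For the kinetic energy we may write similar equations in which the averages may be taken either for a single configuration or for the whole ensemble. But since

$$\frac{d\bar\epsilon_p}{d\Theta} = \frac{n}{2},$$

the general formula reduces to

$$\overline{(\epsilon_p - \bar\epsilon_p)^{h+1}} = \Theta^2 \frac{d}{d\Theta}\, \overline{(\epsilon_p - \bar\epsilon_p)^h} + \tfrac{1}{2} n h\, \Theta^2\, \overline{(\epsilon_p - \bar\epsilon_p)^{h-1}} \tag{233}$$

or

$$\frac{\overline{(\epsilon_p - \bar\epsilon_p)^{h+1}}}{\bar\epsilon_p^{\,h+1}} = \frac{2\Theta}{n} \frac{d}{d\Theta} \frac{\overline{(\epsilon_p - \bar\epsilon_p)^h}}{\bar\epsilon_p^{\,h}} + \frac{2h}{n} \frac{\overline{(\epsilon_p - \bar\epsilon_p)^h}}{\bar\epsilon_p^{\,h}} + \frac{2h}{n} \frac{\overline{(\epsilon_p - \bar\epsilon_p)^{h-1}}}{\bar\epsilon_p^{\,h-1}}. \tag{234}$$

But since identically

$$\frac{\overline{(\epsilon_p - \bar\epsilon_p)^0}}{\bar\epsilon_p^{\,0}} = 1, \qquad \frac{\overline{(\epsilon_p - \bar\epsilon_p)^1}}{\bar\epsilon_p} = 0,$$

the value of the corresponding expression for any index will be independent of $\Theta$ and the formula reduces to

$$\overline{\left(\frac{\epsilon_p - \bar\epsilon_p}{\bar\epsilon_p}\right)^{h+1}} = \frac{2h}{n}\, \overline{\left(\frac{\epsilon_p - \bar\epsilon_p}{\bar\epsilon_p}\right)^h} + \frac{2h}{n}\, \overline{\left(\frac{\epsilon_p - \bar\epsilon_p}{\bar\epsilon_p}\right)^{h-1}}; \tag{235}$$

we have therefore

$$\overline{\left(\frac{\epsilon_p - \bar\epsilon_p}{\bar\epsilon_p}\right)^2} = \frac{2}{n}, \qquad \overline{\left(\frac{\epsilon_p - \bar\epsilon_p}{\bar\epsilon_p}\right)^3} = \frac{8}{n^2}, \qquad \overline{\left(\frac{\epsilon_p - \bar\epsilon_p}{\bar\epsilon_p}\right)^4} = \frac{48}{n^3} + \frac{12}{n^2}, \quad \text{etc.}*$$

* In the case discussed in the preceding foot-notes we get easily

$$\overline{(\epsilon_q - \bar\epsilon_q)^h} = \overline{(\epsilon_p - \bar\epsilon_p)^h}, \qquad \overline{\left(\frac{\epsilon_q - \bar\epsilon_q}{\bar\epsilon_q - \epsilon_a}\right)^h} = \overline{\left(\frac{\epsilon_p - \bar\epsilon_p}{\bar\epsilon_p}\right)^h}.$$

For the total energy we have in this case

$$\overline{\left(\frac{\epsilon - \bar\epsilon}{\bar\epsilon - \epsilon_a}\right)^{h+1}} = \frac{h}{n}\, \overline{\left(\frac{\epsilon - \bar\epsilon}{\bar\epsilon - \epsilon_a}\right)^h} + \frac{h}{n}\, \overline{\left(\frac{\epsilon - \bar\epsilon}{\bar\epsilon - \epsilon_a}\right)^{h-1}},$$

$$\overline{\left(\frac{\epsilon - \bar\epsilon}{\bar\epsilon - \epsilon_a}\right)^2} = \frac{1}{n}, \qquad \overline{\left(\frac{\epsilon - \bar\epsilon}{\bar\epsilon - \epsilon_a}\right)^3} = \frac{2}{n^2}, \qquad \overline{\left(\frac{\epsilon - \bar\epsilon}{\bar\epsilon - \epsilon_a}\right)^4} = \frac{3}{n^2} + \frac{6}{n^3}, \quad \text{etc.}$$

It will be observed that when $\psi$ or $\bar\epsilon$ is given as function of $\Theta$, all averages of the form $\overline{\epsilon^h}$ or $\overline{(\epsilon - \bar\epsilon)^h}$ are thereby determined. So also if $\psi_q$ or $\bar\epsilon_q$ is given as function of $\Theta$, all averages of the form $\overline{\epsilon_q^h}$ or $\overline{(\epsilon_q - \bar\epsilon_q)^h}$ are determined.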
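The averages of powers of the relative anomaly of the kinetic energy can be checked with exact rational arithmetic. This is a sketch, not part of the original argument. It assumes that the recurrence has the form r[h+1] = (2h/n)(r[h] + r[h-1]) with r[0] = 1 and r[1] = 0, and compares it with central moments built from raw moments of the gamma form Γ(n/2 + h)/Γ(n/2) Θ^h, in which the modulus cancels from the ratios. The function names are hypothetical.

```python
from fractions import Fraction
from math import comb

def ratio_moments_by_recurrence(n, h_max):
    """Averages of ((e_p - mean)/mean)**h for h = 0..h_max via the recurrence
    r[h+1] = (2h/n) * (r[h] + r[h-1]), starting from r[0] = 1, r[1] = 0."""
    r = [Fraction(1), Fraction(0)]
    for h in range(1, h_max):
        r.append(Fraction(2 * h, n) * (r[h] + r[h - 1]))
    return r

def ratio_moments_direct(n, h_max):
    """The same averages from raw moments k(k+1)...(k+j-1) with k = n/2,
    measured in units of the modulus (which cancels in the ratio)."""
    k = Fraction(n, 2)
    raw = [Fraction(1)]
    for j in range(h_max):
        raw.append(raw[-1] * (k + j))
    out = []
    for h in range(h_max + 1):
        # Binomial expansion of the average of (x - k)**h.
        central = sum(comb(h, j) * raw[j] * (-k) ** (h - j) for j in range(h + 1))
        out.append(central / k ** h)
    return out

if __name__ == "__main__":
    for n in (3, 6, 20):
        print(n, ratio_moments_by_recurrence(n, 4))
```

Because `Fraction` is exact, agreement between the two routines is an identity rather than a numerical coincidence, and the printed values show the anomalies shrinking as the number of degrees of freedom grows.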
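But

$$\bar\epsilon_q = \bar\epsilon - \tfrac{1}{2} n \Theta.$$

Therefore if any one of the quantities $\psi$, $\psi_q$, $\bar\epsilon$, $\bar\epsilon_q$ is known as function of $\Theta$, and $n$ is also known, all averages of any of the forms mentioned are thereby determined as functions of the same variable. In any case all averages of the form

$$\overline{\left(\frac{\epsilon_p - \bar\epsilon_p}{\bar\epsilon_p}\right)^h}$$

are known in terms of $n$ alone, and have the same value whether taken for the whole ensemble or limited to any particular configuration.

If we differentiate the equation

$$\int\!\cdots\!\int_{\text{all phases}} e^{\frac{\psi - \epsilon}{\Theta}}\, dp_1 \ldots dq_n = 1 \tag{236}$$

with respect to $a_1$, and multiply by $\Theta$, we have

$$\int\!\cdots\!\int \left(\frac{d\psi}{da_1} - \frac{d\epsilon}{da_1}\right) e^{\frac{\psi - \epsilon}{\Theta}}\, dp_1 \ldots dq_n = 0. \tag{237}$$

Differentiating again, with respect to $a_1$, with respect to $a_2$, and with respect to $\Theta$, we have

$$\int\!\cdots\!\int \left[\frac{d^2\psi}{da_1^2} - \frac{d^2\epsilon}{da_1^2} + \frac{1}{\Theta} \left(\frac{d\psi}{da_1} - \frac{d\epsilon}{da_1}\right)^2\right] e^{\frac{\psi - \epsilon}{\Theta}}\, dp_1 \ldots dq_n = 0, \tag{238}$$

$$\int\!\cdots\!\int \left[\frac{d^2\psi}{da_1\, da_2} - \frac{d^2\epsilon}{da_1\, da_2} + \frac{1}{\Theta} \left(\frac{d\psi}{da_1} - \frac{d\epsilon}{da_1}\right) \left(\frac{d\psi}{da_2} - \frac{d\epsilon}{da_2}\right)\right] e^{\frac{\psi - \epsilon}{\Theta}}\, dp_1 \ldots dq_n = 0, \tag{239}$$

$$\int\!\cdots\!\int \left[\frac{d^2\psi}{da_1\, d\Theta} + \left(\frac{d\psi}{da_1} - \frac{d\epsilon}{da_1}\right) \left(\frac{1}{\Theta} \frac{d\psi}{d\Theta} - \frac{\psi - \epsilon}{\Theta^2}\right)\right] e^{\frac{\psi - \epsilon}{\Theta}}\, dp_1 \ldots dq_n = 0. \tag{240}$$

The multiple integrals in the last four equations represent the average values of the expressions in the brackets, which we may therefore set equal to zero. The first gives

$$\frac{d\psi}{da_1} = \overline{\frac{d\epsilon}{da_1}} = -\bar A_1, \tag{241}$$

as already obtained. With this relation and (191) we get from the other equations

$$\overline{(A_1 - \bar A_1)^2} = \Theta \left(\overline{\frac{d^2\epsilon}{da_1^2}} - \frac{d^2\psi}{da_1^2}\right) = \Theta \left(\frac{d\bar A_1}{da_1} - \overline{\frac{dA_1}{da_1}}\right), \tag{242}$$

$$\overline{(A_1 - \bar A_1)(A_2 - \bar A_2)} = \Theta \left(\overline{\frac{d^2\epsilon}{da_1\, da_2}} - \frac{d^2\psi}{da_1\, da_2}\right) = \Theta \left(\frac{d\bar A_1}{da_2} - \overline{\frac{dA_1}{da_2}}\right), \tag{243}$$

$$\overline{(A_1 - \bar A_1)(\epsilon - \bar\epsilon)} = -\Theta^2 \frac{d^2\psi}{da_1\, d\Theta} = \Theta^2 \frac{d\bar A_1}{d\Theta} = -\Theta^2 \frac{d\bar\eta}{da_1}.$$

We may add for comparison equation (205), which might be derived from (236) by differentiating twice with respect to $\Theta$:

$$\overline{(\epsilon - \bar\epsilon)^2} = -\Theta^3 \frac{d^2\psi}{d\Theta^2} = \Theta^2 \frac{d\bar\epsilon}{d\Theta}. \tag{244}$$

The two last equations give

$$\overline{(A_1 - \bar A_1)(\epsilon - \bar\epsilon)} = \frac{d\bar A_1}{d\bar\epsilon}\, \overline{(\epsilon - \bar\epsilon)^2}. \tag{245}$$

If $\psi$ or $\bar\epsilon$ is known as function of $\Theta$, $a_1$, $a_2$, etc., $\overline{(\epsilon - \bar\epsilon)^2}$ may be obtained by differentiation as function of the same variables. And if $\psi$, or $\bar A_1$, or $\bar\eta$ is known as function of $\Theta$, $a_1$, etc., $\overline{(A_1 - \bar A_1)(\epsilon - \bar\epsilon)}$ may be obtained by differentiation. But $\overline{(A_1 - \bar A_1)^2}$ and $\overline{(A_1 - \bar A_1)(A_2 - \bar A_2)}$ cannot be obtained in any similar manner.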
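The relation between the mean square anomaly of the energy and the derivative of the average energy with respect to the modulus can be tried on a concrete system. The sketch below assumes a hypothetical system of one degree of freedom with energy p^2/2 + q^4 (unit mass, no external coordinates), evaluates the canonical averages by a midpoint sum over a truncated region of phase, and estimates the derivative by a central difference. None of these choices come from the text.

```python
import math

def energy(p, q):
    """Assumed energy of the illustrative system: kinetic p^2/2 plus potential q^4."""
    return 0.5 * p * p + q ** 4

def canonical_averages(theta, n=300, p_max=8.0, q_max=3.0):
    """Return (mean energy, mean square anomaly) for modulus theta.

    The phase integrals are midpoint sums over [-p_max, p_max] x [-q_max, q_max];
    the weight exp(-energy/theta) is negligible outside that box for theta near 1.
    The cell area cancels in the ratios, so it is omitted.
    """
    hp, hq = 2.0 * p_max / n, 2.0 * q_max / n
    z = e1 = e2 = 0.0
    for i in range(n):
        p = -p_max + (i + 0.5) * hp
        for j in range(n):
            q = -q_max + (j + 0.5) * hq
            e = energy(p, q)
            w = math.exp(-e / theta)
            z += w
            e1 += w * e
            e2 += w * e * e
    mean = e1 / z
    return mean, e2 / z - mean * mean

def check_fluctuation(theta=1.0, d=0.01):
    """Return (mean square anomaly, theta^2 * d(mean energy)/d(theta))."""
    _, variance = canonical_averages(theta)
    upper, _ = canonical_averages(theta + d)
    lower, _ = canonical_averages(theta - d)
    return variance, theta ** 2 * (upper - lower) / (2.0 * d)

if __name__ == "__main__":
    print(check_fluctuation())
```

For this particular energy the average is three quarters of the modulus (one half from the kinetic part, one quarter from the quartic potential), so both returned numbers should be close to 0.75 at unit modulus.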
We have seen that $\overline{(\epsilon - \bar\epsilon)^2}$ is in general a vanishing quantity for very great values of $n$, which we may regard as contained implicitly in $\Theta$ as a divisor. The same is true of $\overline{(A_1 - \bar A_1)(\epsilon - \bar\epsilon)}$. It does not appear that we can assert the same of $\overline{(A_1 - \bar A_1)^2}$ or $\overline{(A_1 - \bar A_1)(A_2 - \bar A_2)}$, since $d^2\epsilon/da_1^2$ may be very great. The quantities $d^2\epsilon/da_1^2$ and $d^2\psi/da_1^2$ belong to the class called elasticities. The former expression represents an elasticity measured under the condition that while $a_1$ is varied the internal coordinates $q_1, \ldots q_n$ all remain fixed. The latter is an elasticity measured under the condition that when $a_1$ is varied the ensemble remains canonically distributed within the same modulus. This corresponds to an elasticity in physics measured under the condition of constant temperature. It is evident that the former is greater than the latter, and it may be enormously greater.

The divergences of the force $A_1$ from its average value are due in part to the differences of energy in the systems of the ensemble, and in part to the differences in the value of the forces which exist in systems of the same energy. If we write $\overline{A_1}|_\epsilon$ for the average value of $A_1$ in systems of the ensemble which have any same energy, it will be determined by the equation

$$\overline{A_1}|_\epsilon = \frac{\int\!\cdots\!\int A_1\, e^{\frac{\psi - \epsilon}{\Theta}}\, dp_1 \ldots dq_n}{\int\!\cdots\!\int e^{\frac{\psi - \epsilon}{\Theta}}\, dp_1 \ldots dq_n}, \tag{246}$$

where the limits of integration in both multiple integrals are two values of the energy which differ infinitely little, say $\epsilon$ and $\epsilon + d\epsilon$. This will make the factor $e^{\frac{\psi - \epsilon}{\Theta}}$ constant within the limits of integration, and it may be cancelled in the numerator and denominator, leaving

$$\overline{A_1}|_\epsilon = \frac{\int\!\cdots\!\int A_1\, dp_1 \ldots dq_n}{\int\!\cdots\!\int dp_1 \ldots dq_n}, \tag{247}$$

where the integrals as before are to be taken between $\epsilon$ and $\epsilon + d\epsilon$. $\overline{A_1}|_\epsilon$ is therefore independent of $\Theta$, being a function of the energy and the external coordinates.

Now we have identically

$$A_1 - \bar A_1 = (A_1 - \overline{A_1}|_\epsilon) + (\overline{A_1}|_\epsilon - \bar A_1),$$

where $A_1 - \overline{A_1}|_\epsilon$ denotes the excess of the force (tending to increase $a_1$) exerted by any system above the average of such forces for systems of the same energy.
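Accordingly,

$$(A_1 - \bar A_1)^2 = (A_1 - \overline{A_1}|_\epsilon)^2 + 2 (A_1 - \overline{A_1}|_\epsilon)(\overline{A_1}|_\epsilon - \bar A_1) + (\overline{A_1}|_\epsilon - \bar A_1)^2.$$

But the average value of $(A_1 - \overline{A_1}|_\epsilon)(\overline{A_1}|_\epsilon - \bar A_1)$ for systems of the ensemble which have the same energy is zero, since for such systems the second factor is constant. Therefore the average for the whole ensemble is zero, and

$$\overline{(A_1 - \bar A_1)^2} = \overline{(A_1 - \overline{A_1}|_\epsilon)^2} + \overline{(\overline{A_1}|_\epsilon - \bar A_1)^2}. \tag{248}$$

In the same way it may be shown that

$$\overline{(A_1 - \bar A_1)(\epsilon - \bar\epsilon)} = \overline{(\overline{A_1}|_\epsilon - \bar A_1)(\epsilon - \bar\epsilon)}. \tag{249}$$

It is evident that in ensembles in which the anomalies of energy $\epsilon - \bar\epsilon$ may be regarded as insensible the same will be true of the quantities represented by $\overline{A_1}|_\epsilon - \bar A_1$.

The properties of quantities of the form $\overline{A_1}|_\epsilon$ will be farther considered in Chapter X, which will be devoted to ensembles of constant energy.

It may not be without interest to consider some general formulae relating to averages in a canonical ensemble, which embrace many of the results which have been given in this chapter.

Let $u$ be any function of the internal and external coordinates with the momenta and modulus. We have by definition

$$\bar u = \int\!\cdots\!\int_{\text{all phases}} u\, e^{\frac{\psi - \epsilon}{\Theta}}\, dp_1 \ldots dq_n. \tag{250}$$

If we differentiate with respect to $\Theta$, we have

$$\frac{d\bar u}{d\Theta} = \int\!\cdots\!\int_{\text{all phases}} \left(\frac{du}{d\Theta} - \frac{u (\psi - \epsilon)}{\Theta^2} + \frac{u}{\Theta} \frac{d\psi}{d\Theta}\right) e^{\frac{\psi - \epsilon}{\Theta}}\, dp_1 \ldots dq_n$$

or

$$\frac{d\bar u}{d\Theta} = \overline{\frac{du}{d\Theta}} - \frac{\overline{u (\psi - \epsilon)}}{\Theta^2} + \frac{\bar u}{\Theta} \frac{d\psi}{d\Theta}.$$

Setting $u = 1$ in this equation, we get

$$\frac{d\psi}{d\Theta} = \frac{\psi - \bar\epsilon}{\Theta},$$

and substituting this value, we have

$$\frac{d\bar u}{d\Theta} = \overline{\frac{du}{d\Theta}} + \frac{\overline{u\epsilon}}{\Theta^2} - \frac{\bar u\, \bar\epsilon}{\Theta^2} \tag{251}$$

or

$$\Theta^2 \frac{d\bar u}{d\Theta} - \Theta^2\, \overline{\frac{du}{d\Theta}} = \overline{u\epsilon} - \bar u\, \bar\epsilon = \overline{(u - \bar u)(\epsilon - \bar\epsilon)}. \tag{252}$$

If we differentiate equation (250) with respect to $a$ (which may represent any of the external coordinates), and write $A$ for the force $-\frac{d\epsilon}{da}$, we get

$$\frac{d\bar u}{da} = \int\!\cdots\!\int_{\text{all phases}} \left(\frac{du}{da} + \frac{u}{\Theta} \frac{d\psi}{da} + \frac{u}{\Theta} A\right) e^{\frac{\psi - \epsilon}{\Theta}}\, dp_1 \ldots dq_n$$

or

$$\frac{d\bar u}{da} = \overline{\frac{du}{da}} + \frac{\bar u}{\Theta} \frac{d\psi}{da} + \frac{\overline{uA}}{\Theta}.$$

Setting $u = 1$ in this equation, we get

$$\frac{d\psi}{da} = -\bar A. \tag{253}$$

Substituting this value, we have

$$\frac{d\bar u}{da} = \overline{\frac{du}{da}} + \frac{\overline{uA}}{\Theta} - \frac{\bar u\, \bar A}{\Theta} \tag{254}$$

or

$$\Theta \frac{d\bar u}{da} - \Theta\, \overline{\frac{du}{da}} = \overline{uA} - \bar u\, \bar A = \overline{(u - \bar u)(A - \bar A)}. \tag{255}$$

Repeated applications of the principles expressed by equations (252) and (255) are perhaps best made in the particular cases. Yet we may write (252) in this form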
85 (e + D) (u — u) — 0, (256) where D represents the operator ©2 djd®. Henee (€ + D)h (u-u)= 0, (257) where h is any positive whole number. It will be observed, that since e is not function of ©, (e + JJ)n may be expanded by the binomial theorem. Or, we may write But the operator (e + Z))*, although in some respects more simple than the operator without the average sign on the e, cannot be expanded oy the binomial theorem, since e is a function of © with the external coordinates. So from equation (254) we have The binomial theorem cannot be applied to these operators. Again, if we now distinguish, as usual, the several external coordinates by s affixes, we may apply successively to the expression u — u any or all of the operators (e + D) u = (e -f D) u, (258) whence (e + D)h u = (e + JD>)h u. (259) (260) whence (261) and (262) whence (263)86 AVERAGES IN A CANONICAL ENSEMBLE. as many times as we choose, and in any order, the average value of the result will be zero. Or, if we apply the same operators to w, and finally take the average value, it will be the same as the value obtained by writing the sign of average separately as u, and on e, A±, A2, etc., in all the operators. If u is independent of the momenta, formulae similar to the preceding, but having eq in place of e, may be derived from equation (179).CHAPTER Vili. ON CERTAIN IMPORTANT FUNCTIONS OF THE ENERGIES OF A SYSTEM. Ik order to consider more particularly the distribution of a canonical ensemble in energy, and for other purposes, it will be convenient to use the following definitions and notations. Let us denote by V the extension-in-phase below a certain limit of energy which we shall call e. That is, let the integration being extended (with constant values of the external coordinates) over all phases for which the energy is less than the limit e. We shall suppose that the value of this integral is not infinite, except for an infinite value of the limiting energy. 
This will not exclude any kind of system to which the canonical distribution is applicable. For if

$$\int \!\cdots\! \int e^{-\frac{\epsilon}{\Theta}}\, dp_1 \ldots dq_n$$

taken without limits has a finite value,* the less value represented by

$$e^{-\frac{\epsilon}{\Theta}} \int \!\cdots\! \int dp_1 \ldots dq_n$$

taken below a limiting value of $\epsilon$, and with the $\epsilon$ before the integral sign representing that limiting value, will also be finite. Therefore the value of $V$, which differs only by a constant factor, will also be finite, for finite $\epsilon$. It is a function of $\epsilon$ and the external coordinates, a continuous increasing function of $\epsilon$, which becomes infinite with $\epsilon$, and vanishes for the smallest possible value of $\epsilon$, or for $\epsilon = -\infty$, if the energy may be diminished without limit.

* This is a necessary condition of the canonical distribution. See Chapter IV, p. 35.

Let us also set

$$\phi = \log \frac{dV}{d\epsilon}. \tag{266}$$

The extension in phase between any two limits of energy, $\epsilon'$ and $\epsilon''$, will be represented by the integral

$$\int_{\epsilon'}^{\epsilon''} e^{\phi}\, d\epsilon. \tag{267}$$

And in general, we may substitute $e^{\phi}\, d\epsilon$ for $dp_1 \ldots dq_n$ in a $2n$-fold integral, reducing it to a simple integral, whenever the limits can be expressed by the energy alone, and the other factor under the integral sign is a function of the energy alone, or with quantities which are constant in the integration.

In particular we observe that the probability that the energy of an unspecified system of a canonical ensemble lies between the limits $\epsilon'$ and $\epsilon''$ will be represented by the integral

$$\int_{\epsilon'}^{\epsilon''} e^{\frac{\psi - \epsilon}{\Theta} + \phi}\, d\epsilon, \tag{268}$$

and that the average value in the ensemble of any quantity which only varies with the energy is given by the equation

$$\bar u = \int_{V=0}^{\epsilon=\infty} u\, e^{\frac{\psi - \epsilon}{\Theta} + \phi}\, d\epsilon. \tag{269}$$

…

$$\bar u = \int_{V_q=0}^{\epsilon_q=\infty} u\, e^{\frac{\psi_q - \epsilon_q}{\Theta} + \phi_q}\, d\epsilon_q. \tag{275}$$

When $V_q$ is not a continuous function of $\epsilon_q$, we may write $dV_q$ for $e^{\phi_q}\, d\epsilon_q$ in these formulae.

In like manner also, for any given configuration, let us denote by $V_p$ the extension-in-velocity below a certain limit of kinetic energy specified by $\epsilon_p$. That is, let

$$V_p = \int \!\cdots\! \int \Delta_p^{1/2}\, dp_1 \ldots dp_n, \tag{276}$$

the integration being extended, with constant values of the coordinates, both internal and external, over all values of the momenta for which the kinetic energy is less than the limit $\epsilon_p$. $V_p$ will evidently be a continuous increasing function of $\epsilon_p$ which vanishes and becomes infinite with $\epsilon_p$. Let us set

$$\phi_p = \log \frac{dV_p}{d\epsilon_p}. \tag{277}$$

The extension-in-velocity between any two limits of kinetic energy $\epsilon_p'$ and $\epsilon_p''$ may be represented by the integral

$$\int_{\epsilon_p'}^{\epsilon_p''} e^{\phi_p}\, d\epsilon_p. \tag{278}$$

And in general, we may substitute $e^{\phi_p}\, d\epsilon_p$ for $\Delta_p^{1/2}\, dp_1 \ldots dp_n$ or $\Delta_{\dot q}^{1/2}\, d\dot q_1 \ldots d\dot q_n$ in an $n$-fold integral in which the coordinates are constant, reducing it to a simple integral, when the limits are expressed by the kinetic energy, and the other factor under the integral sign is a function of the kinetic energy, either alone or with quantities which are constant in the integration.

It is easy to express $V_p$ and $\phi_p$ in terms of $\epsilon_p$. Since $\Delta_p$ is a function of the coordinates alone, we have by definition

$$V_p = \Delta_p^{1/2} \int \!\cdots\! \int dp_1 \ldots dp_n, \tag{279}$$

the limits of the integral being given by $\epsilon_p$. That is, if

$$\epsilon_p = F(p_1, \ldots p_n), \tag{280}$$

the limits of the integral for $\epsilon_p = 1$ are given by the equation

$$F(p_1, \ldots p_n) = 1, \tag{281}$$

and the limits of the integral for $\epsilon_p = a^2$ are given by the equation

$$F(p_1, \ldots p_n) = a^2. \tag{282}$$

But since $F$ represents a quadratic function, this equation may be written

$$F\!\left(\frac{p_1}{a}, \ldots \frac{p_n}{a}\right) = 1. \tag{283}$$

The value of $V_p$ may also be put in the form

$$V_p = a^n\, \Delta_p^{1/2} \int \!\cdots\! \int d\frac{p_1}{a} \ldots d\frac{p_n}{a}. \tag{284}$$

Now we may determine $V_p$ for $\epsilon_p = 1$ from (279) where the limits are expressed by (281), and $V_p$ for $\epsilon_p = a^2$ from (284) taking the limits from (283). The two integrals thus determined are evidently identical, and we have

$$(V_p)_{\epsilon_p = a^2} = a^n\, (V_p)_{\epsilon_p = 1}, \tag{285}$$

i.e., $V_p$ varies as $\epsilon_p^{n/2}$.
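The scaling of $V_p$ with $\epsilon_p^{n/2}$ can be checked numerically in a simple case. The following sketch is an illustration only, not part of the original argument. It assumes Cartesian momenta with unit masses, so that $\epsilon_p = \tfrac12 \sum p_i^2$ and $\Delta_p = 1$, and it estimates the momentum-space volume below a given kinetic energy by Monte Carlo sampling. The function names are hypothetical, and the exact comparison value is the standard volume of an $n$-dimensional ball of radius $\sqrt{2\epsilon_p}$.

```python
import math
import random


def ball_volume(n, radius):
    """Exact volume of an n-dimensional ball (standard formula)."""
    return math.pi ** (n / 2) * radius ** n / math.gamma(n / 2 + 1)


def extension_in_velocity_mc(energy, n, samples=200_000, seed=0):
    """Monte Carlo estimate of V_p, assuming eps_p = sum(p_i**2) / 2 (unit masses).

    Points are drawn uniformly from the cube [-R, R]^n with R = sqrt(2 * energy).
    The fraction of points whose kinetic energy is below `energy`, multiplied
    by the cube volume, estimates the extension-in-velocity.
    """
    rng = random.Random(seed)
    radius = math.sqrt(2.0 * energy)
    inside = 0
    for _ in range(samples):
        kinetic = 0.5 * sum(rng.uniform(-radius, radius) ** 2 for _ in range(n))
        if kinetic < energy:
            inside += 1
    return (2.0 * radius) ** n * inside / samples


if __name__ == "__main__":
    n = 3
    v1 = extension_in_velocity_mc(1.0, n, seed=0)
    v4 = extension_in_velocity_mc(4.0, n, seed=1)
    print("V_p(1) estimate:", v1, "exact:", ball_volume(n, math.sqrt(2.0)))
    print("V_p(4) / V_p(1):", v4 / v1, "expected 4 ** (n / 2) =", 4 ** (n / 2))
```

Under these assumptions the ratio of the two estimates should lie close to $4^{n/2}$, up to sampling error, which is what the change of variables from $p_i$ to $p_i/a$ in the text asserts exactly.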
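We may therefore set

$$V_p = C\, \epsilon_p^{\frac{n}{2}}, \qquad e^{\phi_p} = \frac{n}{2}\, C\, \epsilon_p^{\frac{n}{2} - 1}, \tag{286}$$

where $C$ is a constant, at least for fixed values of the internal coordinates.

To determine this constant, let us consider the case of a canonical distribution, for which we have

$$\int_0^\infty e^{\frac{\psi_p - \epsilon_p}{\Theta} + \phi_p}\, d\epsilon_p = 1,$$

where $e^{-\frac{\psi_p}{\Theta}} = (2\pi\Theta)^{\frac{n}{2}}$. Substituting this value, and that of $e^{\phi_p}$ from (286), we get

$$\frac{n}{2}\, C \int_0^\infty e^{-\frac{\epsilon_p}{\Theta}}\, \epsilon_p^{\frac{n}{2} - 1}\, d\epsilon_p = (2\pi\Theta)^{\frac{n}{2}}, \qquad \frac{n}{2}\, C\, \Gamma(\tfrac12 n) = (2\pi)^{\frac{n}{2}}, \qquad C = \frac{(2\pi)^{\frac{n}{2}}}{\Gamma(\tfrac12 n + 1)}. \tag{287}$$

Having thus determined the value of the constant $C$, we may substitute it in the general expressions (286), and obtain the following values, which are perfectly general:

$$V_p = \frac{(2\pi\epsilon_p)^{\frac{n}{2}}}{\Gamma(\tfrac12 n + 1)}, \tag{288}$$

$$e^{\phi_p} = \frac{(2\pi)^{\frac{n}{2}}\, \epsilon_p^{\frac{n}{2} - 1}}{\Gamma(\tfrac12 n)}.* \tag{289}$$

* Very similar values for $V_q$, $e^{\phi_q}$, $V$, and $e^{\phi}$ may be found in the same way in the case discussed in the preceding foot-notes (see pages 54, 72, 77, and 79), in which $\epsilon_q$ is a quadratic function of the $q$'s, and $\Delta_{\dot q}$ independent of the $q$'s. In this case we have …

It will be observed that the values of $V_p$ and $\phi_p$ for any given $\epsilon_p$ are independent of the configuration, and even of the nature of the system considered, except with respect to its number of degrees of freedom.

Returning to the canonical ensemble, we may express the probability that the kinetic energy of a system of a given configuration, but otherwise unspecified, falls within given limits, by either member of the following equation

$$\int e^{\frac{\psi_p - \epsilon_p}{\Theta} + \phi_p}\, d\epsilon_p = \frac{1}{\Gamma(\tfrac12 n)} \int e^{-\frac{\epsilon_p}{\Theta}} \left(\frac{\epsilon_p}{\Theta}\right)^{\frac{n}{2} - 1} d\!\left(\frac{\epsilon_p}{\Theta}\right). \tag{290}$$

Since this value is independent of the coordinates it also represents the probability that the kinetic energy of an unspecified system of a canonical ensemble falls within the limits. The form of the last integral also shows that the probability that the ratio of the kinetic energy to the modulus falls within given limits is independent also of the value of the modulus, being determined entirely by the number of degrees of freedom of the system and the limiting values of the ratio.

The average value of any function of the kinetic energy, either for the whole ensemble, or for any particular configuration, is given by

$$\bar u = \frac{1}{\Theta^{\frac{n}{2}}\, \Gamma(\tfrac12 n)} \int_0^\infty u\, e^{-\frac{\epsilon_p}{\Theta}}\, \epsilon_p^{\frac{n}{2} - 1}\, d\epsilon_p.* \tag{291}$$

Thus:

$$\overline{\epsilon_p^{\,m}} = \frac{\Gamma(m + \tfrac12 n)}{\Gamma(\tfrac12 n)}\, \Theta^m,$$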
*' if m + in > 0 ; t(292) Kp r Qsn +1) r (|w) ^ ; ’ (293) * The corresponding equation for the average value of any function of the potential energy, when this is a quadratic function of the q* s, and A $ is independent of the q’s, is ®ÏT(ïn)' 'fue w—l (Cq — €a)5 dcq. In the same case, the average value of any function of the (total) energy is given by the equation - 1 f “ ©'■r(n)-' 1 ue ® (.— «a)” lde. Hence in this case /■; f„Vm —r 0. and d_1 di~~Q’ if = 0, if »> 2, if n > 1. If n = 1, = 2 it and d

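$$\overline{e^{\phi_p}} = \frac{\Gamma(n - 1)}{[\Gamma(\tfrac12 n)]^2}\, (2\pi)^{\frac{n}{2}}\, \Theta^{\frac{n}{2} - 1}, \quad \text{if } n > 1; \tag{294}$$

$$\overline{\frac{d\phi_p}{d\epsilon_p}} = \frac{1}{\Theta}, \quad \text{if } n > 2; \tag{295}$$

$$\overline{e^{-\phi_p} V_p} = \Theta. \tag{296}$$

If $n = 2$, $e^{\phi_p} = 2\pi$, and $d\phi_p/d\epsilon_p = 0$, for any value of $\epsilon_p$.

The definitions of $V$, $V_q$, and $V_p$ give

$$V = \iint dV_p\, dV_q, \tag{297}$$

where the integrations cover all phases for which the energy is less than the limit $\epsilon$, for which the value of $V$ is sought. This gives

$$V = \int_{V_q=0}^{\epsilon_q=\epsilon} V_p\, dV_q, \tag{298}$$

and

$$e^{\phi} = \frac{dV}{d\epsilon} = \int_{V_q=0}^{\epsilon_q=\epsilon} e^{\phi_p}\, dV_q, \tag{299}$$

where $V_p$ and $e^{\phi_p}$ are connected with $V_q$ by the equation

$$\epsilon_p + \epsilon_q = \text{constant} = \epsilon. \tag{300}$$

If $n > 2$, $e^{\phi_p}$ vanishes at the upper limit, i.e., for $\epsilon_p = 0$, and we get by another differentiation

$$e^{\phi}\, \frac{d\phi}{d\epsilon} = \int_{V_q=0}^{\epsilon_q=\epsilon} e^{\phi_p}\, \frac{d\phi_p}{d\epsilon_p}\, dV_q. \tag{301}$$

We may also write

$$V = \int_{V_q=0}^{\epsilon_q=\epsilon} V_p\, e^{\phi_q}\, d\epsilon_q, \tag{302}$$

$$e^{\phi} = \int_{V_q=0}^{\epsilon_q=\epsilon} e^{\phi_p + \phi_q}\, d\epsilon_q, \tag{303}$$

etc., when $V_q$ is a continuous function of $\epsilon_q$ commencing with the value $V_q = 0$, or when we choose to attribute to $V_q$ a fictitious continuity commencing with the value zero, as described on page 90.

If we substitute in these equations the values of $V_p$ and $e^{\phi_p}$ which we have found, we get

$$V = \frac{(2\pi)^{\frac{n}{2}}}{\Gamma(\tfrac12 n + 1)} \int_{V_q=0}^{\epsilon_q=\epsilon} (\epsilon - \epsilon_q)^{\frac{n}{2}}\, dV_q, \tag{304}$$

$$e^{\phi} = \frac{(2\pi)^{\frac{n}{2}}}{\Gamma(\tfrac12 n)} \int_{V_q=0}^{\epsilon_q=\epsilon} (\epsilon - \epsilon_q)^{\frac{n}{2} - 1}\, dV_q, \tag{305}$$

where $e^{\phi_q}\, d\epsilon_q$ may be substituted for $dV_q$ in the cases above described. If, therefore, $n$ is known, and $V_q$ as function of $\epsilon_q$, $V$ and $e^{\phi}$ may be found by quadratures.

It appears from these equations that $V$ is always a continuous increasing function of $\epsilon$, commencing with the value $V = 0$, even when this is not the case with respect to $V_q$ and $\epsilon_q$. The same is true of $e^{\phi}$, when $n > 2$, or when $n = 2$ if $V_q$ increases continuously with $\epsilon_q$ from the value $V_q = 0$.

The last equation may be derived from the preceding by differentiation with respect to $\epsilon$. Successive differentiations give, if $h < \tfrac12 n + 1$,

$$\frac{d^h V}{d\epsilon^h} = \frac{(2\pi)^{\frac{n}{2}}}{\Gamma(\tfrac12 n + 1 - h)} \int_{V_q=0}^{\epsilon_q=\epsilon} (\epsilon - \epsilon_q)^{\frac{n}{2} - h}\, dV_q. \tag{306}$$

$d^h V/d\epsilon^h$ is therefore positive if $h < \tfrac12 n + 1$. It is an increasing function of $\epsilon$, if $h < \tfrac12 n$. If $\epsilon$ is not capable of being diminished without limit, $d^h V/d\epsilon^h$ vanishes for the least possible value of $\epsilon$, if $h < \tfrac12 n$. If $n$ is even,

$$\frac{d^{\frac{n}{2}} V}{d\epsilon^{\frac{n}{2}}} = (2\pi)^{\frac{n}{2}}\, (V_q)_{\epsilon_q = \epsilon}. \tag{307}$$

That is, $V_q$ is the same function of $\epsilon_q$, as $\dfrac{1}{(2\pi)^{\frac{n}{2}}} \dfrac{d^{\frac{n}{2}} V}{d\epsilon^{\frac{n}{2}}}$ of $\epsilon$.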
When $n$ is large, approximate formulae will be more available. It will be sufficient to indicate the method proposed, without precise discussion of the limits of its applicability or of the degree of its approximation. For the value of $e^{\phi}$ corresponding to any given $\epsilon$, we have

$$e^{\phi} = \int_{V_q=0}^{\epsilon_q=\epsilon} e^{\phi_p + \phi_q}\, d\epsilon_q = \int_0^{\epsilon} e^{\phi_p + \phi_q}\, d\epsilon_p, \tag{308}$$

where the variables are connected by the equation (300). The maximum value of $\phi_p + \phi_q$ is therefore characterized by the equation

$$\frac{d\phi_p}{d\epsilon_p} = \frac{d\phi_q}{d\epsilon_q}.$$

… corresponding to any given value of $\epsilon$.

When $V$ or $\phi$ is given as function of $\epsilon$ for any two systems, we may express by quadratures $V$ and $\phi$ for the system formed by combining the two. If we distinguish by the suffixes $(\ )_1$, $(\ )_2$, $(\ )_{12}$ the quantities relating to the three systems, we have easily from the definitions of these quantities

$$V_{12} = \iint dV_1\, dV_2 = \int V_2\, dV_1 = \int V_1\, dV_2 = \int V_1\, e^{\phi_2}\, d\epsilon_2, \tag{315}$$

$$e^{\phi_{12}} = \int e^{\phi_2}\, dV_1 = \int e^{\phi_1}\, dV_2 = \int e^{\phi_1 + \phi_2}\, d\epsilon_2, \tag{316}$$

where the double integral is to be taken within the limits $V_1 = 0$, $V_2 = 0$, and $\epsilon_1 + \epsilon_2 = \epsilon_{12}$, and the variables in the single integrals are connected by the last of these equations, while the limits are given by the first two, which characterize the least possible values of $\epsilon_1$ and $\epsilon_2$ respectively.

It will be observed that these equations are identical in form with those by which $V$ and $\phi$ are derived from $V_p$ or $\phi_p$ and $V_q$ or $\phi_q$, except that they do not admit in the general case those transformations which result from substituting for $V_p$ or $\phi_p$ …

CHAPTER IX.

THE FUNCTION $\phi$ AND THE CANONICAL DISTRIBUTION.

In this chapter we shall return to the consideration of the canonical distribution, in order to investigate those properties which are especially related to the function of the energy which we have denoted by $\phi$.

If we denote by $N$, as usual, the total number of systems in the ensemble,

$$N\, e^{\frac{\psi - \epsilon}{\Theta} + \phi}\, d\epsilon$$

will represent the number having energies between the limits $\epsilon$ and $\epsilon + d\epsilon$. The expression

$$N\, e^{\frac{\psi - \epsilon}{\Theta} + \phi}$$

represents what may be called the density-in-energy. This vanishes for $\epsilon = \infty$, for otherwise the necessary equation

$$\int_{V=0}^{\epsilon=\infty} e^{\frac{\psi - \epsilon}{\Theta} + \phi}\, d\epsilon = 1 \tag{318}$$

could not be fulfilled.
For the same reason the density-in-energy will vanish for $\epsilon = -\infty$, if that is a possible value of the energy. Generally, however, the least possible value of the energy will be a finite value, for which, if $n > 2$, $e^{\phi}$ will vanish,* and therefore the density-in-energy. Now the density-in-energy is necessarily positive, and since it vanishes for extreme values of the energy if $n > 2$, it must have a maximum in such cases, in which the energy may be said to have its most common or most probable value, and which is determined by the equation

$$\frac{d\phi}{d\epsilon} = \frac{1}{\Theta}. \tag{319}$$

* See page 96.

This value of $d\phi/d\epsilon$ is also, when $n > 2$, its average value in the ensemble. For we have identically, by integration by parts,

$$\int_{V=0}^{\epsilon=\infty} \frac{d\phi}{d\epsilon}\, e^{\frac{\psi - \epsilon}{\Theta} + \phi}\, d\epsilon = \left[ e^{\frac{\psi - \epsilon}{\Theta} + \phi} \right]_{V=0}^{\epsilon=\infty} + \frac{1}{\Theta} \int_{V=0}^{\epsilon=\infty} e^{\frac{\psi - \epsilon}{\Theta} + \phi}\, d\epsilon. \tag{320}$$

If $n > 2$, the expression in the brackets, which multiplied by $N$ would represent the density-in-energy, vanishes at the limits, and we have by (269) and (318)

$$\overline{\frac{d\phi}{d\epsilon}} = \frac{1}{\Theta}. \tag{321}$$

It appears, therefore, that for systems of more than two degrees of freedom, the average value of $d\phi/d\epsilon$ in an ensemble canonically distributed is identical with the value of the same differential coefficient as calculated for the most common energy in the ensemble, both values being reciprocals of the modulus.

Hitherto, in our consideration of the quantities $V$, $V_q$, $V_p$, $\phi$, $\phi_q$, $\phi_p$, we have regarded the external coordinates as constant. It is evident, however, from their definitions that $V$ and $\phi$ are in general functions of the external coordinates and the energy ($\epsilon$), that $V_q$ and $\phi_q$ are in general functions of the external coordinates and the potential energy ($\epsilon_q$). $V_p$ and $\phi_p$ we have found to be functions of the kinetic energy ($\epsilon_p$) alone. In the equation

$$e^{-\frac{\psi}{\Theta}} = \int_{V=0}^{\epsilon=\infty} e^{-\frac{\epsilon}{\Theta} + \phi}\, d\epsilon, \tag{322}$$

by which $\psi$ may be determined, $\Theta$ and the external coordinates (contained implicitly in $\phi$) are constant in the integration. The equation shows that $\psi$ is a function of these constants. If their values are varied, we shall have by differentiation, if $n > 2$,

$$e^{-\frac{\psi}{\Theta}} \left( -\frac{1}{\Theta}\, d\psi + \frac{\psi}{\Theta^2}\, d\Theta \right) = \frac{d\Theta}{\Theta^2} \int_{V=0}^{\epsilon=\infty} \epsilon\, e^{-\frac{\epsilon}{\Theta} + \phi}\, d\epsilon + da_1 \int_{V=0}^{\epsilon=\infty} \frac{d\phi}{da_1}\, e^{-\frac{\epsilon}{\Theta} + \phi}\, d\epsilon + da_2 \int_{V=0}^{\epsilon=\infty} \frac{d\phi}{da_2}\, e^{-\frac{\epsilon}{\Theta} + \phi}\, d\epsilon + \text{etc.} \tag{323}$$

(Since $e^{\phi}$ vanishes with $V$, when $n > 2$, there are no terms due to the variations of the limits.) Hence by (269)

$$-\frac{1}{\Theta}\, d\psi + \frac{\psi}{\Theta^2}\, d\Theta = \frac{\bar\epsilon}{\Theta^2}\, d\Theta + \overline{\frac{d\phi}{da_1}}\, da_1 + \overline{\frac{d\phi}{da_2}}\, da_2 + \text{etc.}, \tag{324}$$

or, since

$$\frac{\psi - \bar\epsilon}{\Theta} = \bar\eta, \tag{325}$$

$$d\psi = \bar\eta\, d\Theta - \Theta\, \overline{\frac{d\phi}{da_1}}\, da_1 - \Theta\, \overline{\frac{d\phi}{da_2}}\, da_2 - \text{etc.} \tag{326}$$

…

* See equations (321) and (104). Suffixes are here added to the differential coefficients, to make the meaning perfectly distinct, although the same quantities may be written elsewhere without the suffixes, when it is believed that there is no danger of misapprehension. The suffixes indicate the quantities which are constant in the differentiation, the single letter $a$ standing for all the letters $a_1$, $a_2$, etc., or all except the one which is explicitly varied.

$$d\phi = \frac{d\phi}{d\epsilon}\, d\epsilon + \frac{d\phi}{da_1}\, da_1 + \frac{d\phi}{da_2}\, da_2 + \text{etc.} \tag{334}$$

For the more precise comparison of these equations, we may suppose that the energy in the last equation is some definite and fairly representative energy in the ensemble. For this purpose we might choose the average energy. It will perhaps be more convenient to choose the most common energy, which we shall denote by $\epsilon_0$. The same suffix will be applied to functions of the energy determined for this value. Our identity then becomes

$$d\phi_0 = \left(\frac{d\phi}{d\epsilon}\right)_0 d\epsilon_0 + \left(\frac{d\phi}{da_1}\right)_0 da_1 + \left(\frac{d\phi}{da_2}\right)_0 da_2 + \text{etc.} \tag{335}$$

It has been shown that

$$\overline{\frac{d\phi}{d\epsilon}} = \left(\frac{d\phi}{d\epsilon}\right)_0 = \frac{1}{\Theta}, \tag{336}$$

when $n > 2$. Moreover, since the external coordinates have constant values throughout the ensemble, the values of $d\phi/da_1$, $d\phi/da_2$, etc. vary in the ensemble only on account of the variations of the energy ($\epsilon$), which, as we have seen, may be regarded as sensibly constant throughout the ensemble, when $n$ is very great. In this case, therefore, we may regard the average values

$$\overline{\frac{d\phi}{da_1}}, \quad \overline{\frac{d\phi}{da_2}}, \quad \text{etc.},$$

as practically equivalent to the values relating to the most common energy

$$\left(\frac{d\phi}{da_1}\right)_0, \quad \left(\frac{d\phi}{da_2}\right)_0, \quad \text{etc.}$$

In this case also $d\bar\epsilon$ is practically equivalent to $d\epsilon_0$. We have therefore, for very large values of $n$,

$$-\,d\bar\eta = d\phi_0 \tag{337}$$

approximately. That is, except for an additive constant, $-\bar\eta$ may be regarded as practically equivalent to $\phi_0$, when the number of degrees of freedom of the system is very great. It is not meant by this that the variable part of $\bar\eta + \phi_0$ is numerically of a lower order of magnitude than unity. For when $n$ is very great, $-\bar\eta$ and $\phi_0$ are very great, and we can only conclude that the variable part of $\bar\eta + \phi_0$ is insignificant compared with the variable part of $\bar\eta$ or of $\phi_0$, taken separately.

Now we have already noticed a certain correspondence between the quantities $\Theta$ and $\bar\eta$ and those which in thermodynamics are called temperature and entropy. The property just demonstrated, with those expressed by equation (336), therefore suggests that the quantities $\phi$ and $d\epsilon/d\phi$ may also correspond to the thermodynamic notions of entropy and temperature.
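We leave the discussion of this point to a subsequent chapter, and only mention it here to justify the somewhat detailed investigation of the relations of these quantities.

We may get a clearer view of the limiting form of the relations when the number of degrees of freedom is indefinitely increased, if we expand the function $\phi$ in a series arranged according to ascending powers of $\epsilon - \epsilon_0$. This expansion may be written

$$\phi = \phi_0 + \left(\frac{d\phi}{d\epsilon}\right)_0 (\epsilon - \epsilon_0) + \left(\frac{d^2\phi}{d\epsilon^2}\right)_0 \frac{(\epsilon - \epsilon_0)^2}{2} + \left(\frac{d^3\phi}{d\epsilon^3}\right)_0 \frac{(\epsilon - \epsilon_0)^3}{3!} + \text{etc.} \tag{338}$$

Adding the identical equation

$$\frac{\psi - \epsilon}{\Theta} = \frac{\psi - \epsilon_0}{\Theta} - \frac{\epsilon - \epsilon_0}{\Theta},$$

we get by (336)

$$\frac{\psi - \epsilon}{\Theta} + \phi = \frac{\psi - \epsilon_0}{\Theta} + \phi_0 + \left(\frac{d^2\phi}{d\epsilon^2}\right)_0 \frac{(\epsilon - \epsilon_0)^2}{2} + \left(\frac{d^3\phi}{d\epsilon^3}\right)_0 \frac{(\epsilon - \epsilon_0)^3}{3!} + \text{etc.} \tag{339}$$

Substituting this value in

$$\int_{\epsilon'}^{\epsilon''} e^{\frac{\psi - \epsilon}{\Theta} + \phi}\, d\epsilon,$$

which expresses the probability that the energy of an unspecified system of the ensemble lies between the limits $\epsilon'$ and $\epsilon''$, we get

$$e^{\frac{\psi - \epsilon_0}{\Theta} + \phi_0} \int_{\epsilon'}^{\epsilon''} e^{\left(\frac{d^2\phi}{d\epsilon^2}\right)_0 \frac{(\epsilon - \epsilon_0)^2}{2} + \left(\frac{d^3\phi}{d\epsilon^3}\right)_0 \frac{(\epsilon - \epsilon_0)^3}{3!} + \text{etc.}}\, d\epsilon. \tag{340}$$

When the number of degrees of freedom is very great, and $\epsilon - \epsilon_0$ in consequence very small, we may neglect the higher powers and write*

$$e^{\frac{\psi - \epsilon_0}{\Theta} + \phi_0} \int_{\epsilon'}^{\epsilon''} e^{\left(\frac{d^2\phi}{d\epsilon^2}\right)_0 \frac{(\epsilon - \epsilon_0)^2}{2}}\, d\epsilon. \tag{341}$$

* If a higher degree of accuracy is desired than is afforded by this formula, it may be multiplied by the series obtained from

$$e^{\left(\frac{d^3\phi}{d\epsilon^3}\right)_0 \frac{(\epsilon - \epsilon_0)^3}{3!} + \text{etc.}}$$

by the ordinary formula for the expansion in series of an exponential function. There would be no especial analytical difficulty in taking account of a moderate number of terms of such a series, which would commence

$$1 + \left(\frac{d^3\phi}{d\epsilon^3}\right)_0 \frac{(\epsilon - \epsilon_0)^3}{3!} + \left(\frac{d^4\phi}{d\epsilon^4}\right)_0 \frac{(\epsilon - \epsilon_0)^4}{4!} + \text{etc.}$$

This shows that for a very great number of degrees of freedom the probability of deviations of energy from the most probable value ($\epsilon_0$) approaches the form expressed by the 'law of errors.' With this approximate law, we get

$$e^{\frac{\psi - \epsilon_0}{\Theta} + \phi_0} \left( \frac{-2\pi}{\left(\frac{d^2\phi}{d\epsilon^2}\right)_0} \right)^{\frac12} = 1, \tag{342}$$

$$\overline{(\epsilon - \epsilon_0)^2} = -\frac{1}{\left(\frac{d^2\phi}{d\epsilon^2}\right)_0}, \tag{343}$$

whence

$$\frac{\psi - \epsilon_0}{\Theta} + \phi_0 = \tfrac12 \log \frac{-\left(\frac{d^2\phi}{d\epsilon^2}\right)_0}{2\pi} = -\tfrac12 \log \left( 2\pi\, \overline{(\epsilon - \bar\epsilon)^2} \right). \tag{344}$$

Now it has been proved in Chapter VII that

$$\overline{(\epsilon - \bar\epsilon)^2} = \frac{2}{n}\, \frac{d\bar\epsilon}{d\bar\epsilon_p}\, \bar\epsilon_p^{\,2}.$$

We have therefore

$$\bar\eta + \phi_0 = \frac{\psi - \bar\epsilon}{\Theta} + \phi_0 = -\tfrac12 \log \left( 2\pi\, \overline{(\epsilon - \bar\epsilon)^2} \right) = -\tfrac12 \log \left( \frac{4\pi}{n}\, \frac{d\bar\epsilon}{d\bar\epsilon_p}\, \bar\epsilon_p^{\,2} \right) \tag{345}$$

approximately. The order of magnitude of $\bar\eta + \phi_0$ is therefore that of $\log n$. This magnitude is mainly constant. The order of magnitude of $\bar\eta + \phi_0 - \tfrac12 \log n$ is that of unity.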
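The approach to the 'law of errors' can be seen numerically in a simple case. The sketch below is an illustration under a stated assumption, not a result taken from the text: it supposes a hypothetical system for which $V$ is proportional to $\epsilon^k$, so that $e^{\phi}$ is proportional to $\epsilon^{k-1}$ and the exponent $k$ plays the part of the number of degrees of freedom. It then computes the exact probability that the energy lies within one Gaussian standard deviation of the most probable value $\epsilon_0$, where the standard deviation is taken from the quadratic term of the expansion. The function names are hypothetical.

```python
import math


def energy_density(eps, k, theta):
    """Canonical density in energy, assuming e^phi is proportional to eps**(k - 1)."""
    log_pdf = (k - 1) * math.log(eps) - eps / theta - math.lgamma(k) - k * math.log(theta)
    return math.exp(log_pdf)


def probability_between(lo, hi, k, theta, steps=20_000):
    """Midpoint-rule integral of the energy density between two energies."""
    width = (hi - lo) / steps
    return width * sum(energy_density(lo + (i + 0.5) * width, k, theta) for i in range(steps))


def law_of_errors_check(k, theta):
    """Exact probability of lying within one Gaussian sigma of eps_0 (requires k > 1)."""
    eps0 = (k - 1) * theta               # most probable energy: d(phi)/d(eps) = 1/theta
    second = -(k - 1) / eps0 ** 2        # second derivative of phi at eps_0
    sigma = math.sqrt(-1.0 / second)     # Gaussian width from the quadratic term
    return probability_between(eps0 - sigma, eps0 + sigma, k, theta)


if __name__ == "__main__":
    gaussian_value = math.erf(1.0 / math.sqrt(2.0))
    for k in (4, 40, 400):
        print(k, law_of_errors_check(k, 1.0), "Gaussian value:", gaussian_value)
```

For this assumed form of $\phi$, the computed probability should move toward the Gaussian value as $k$ grows, which is the behaviour the expansion in powers of $\epsilon - \epsilon_0$ predicts.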
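The order of magnitude of $\phi_0$, and therefore of $-\bar\eta$, is that of $n$.*

* Compare (289), (314).

Equation (338) gives for the first approximation

$$\frac{d\phi}{d\epsilon} - \frac{1}{\Theta} = \left(\frac{d^2\phi}{d\epsilon^2}\right)_0 (\epsilon - \epsilon_0),$$

whence

$$\overline{\left(\frac{d\phi}{d\epsilon} - \frac{1}{\Theta}\right)(\epsilon - \epsilon_0)} = \left(\frac{d^2\phi}{d\epsilon^2}\right)_0 \overline{(\epsilon - \epsilon_0)^2} = -1, \tag{349}$$

$$\overline{\left(\frac{d\phi}{d\epsilon} - \frac{1}{\Theta}\right)^2} = -\left(\frac{d^2\phi}{d\epsilon^2}\right)_0 = \frac{1}{\overline{(\epsilon - \bar\epsilon)^2}}. \tag{350}$$

This is of the order of magnitude of $n$.*

* We shall find hereafter that the equation

$$\overline{\left(\frac{d\phi}{d\epsilon} - \frac{1}{\Theta}\right)(\epsilon - \bar\epsilon)} = -1$$

is exact for any value of $n$ greater than 2, and that the equation

$$\overline{\left(\frac{d\phi}{d\epsilon} - \frac{1}{\Theta}\right)^2} = -\overline{\frac{d^2\phi}{d\epsilon^2}}$$

is exact for any value of $n$ greater than 4.

It should be observed that the approximate distribution of the ensemble in energy according to the 'law of errors' is not dependent on the particular form of the function of the energy which we have assumed for the index of probability ($\eta$). In any case, we must have

$$\int_{V=0}^{\epsilon=\infty} e^{\eta + \phi}\, d\epsilon = 1, \tag{351}$$

where $e^{\eta + \phi}$ is necessarily positive. This requires that it shall vanish for $\epsilon = \infty$, and also for $\epsilon = -\infty$, if this is a possible value. It has been shown in the last chapter that if $\epsilon$ has a (finite) least possible value (which is the usual case) and $n > 2$, $e^{\phi}$ will vanish for that least value of $\epsilon$. In general therefore $\eta + \phi$ will have a maximum, which determines the most probable value of the energy. If we denote this value by $\epsilon_0$, and distinguish the corresponding values of the functions of the energy by the same suffix, we shall have

$$\left(\frac{d\eta}{d\epsilon}\right)_0 + \left(\frac{d\phi}{d\epsilon}\right)_0 = 0. \tag{352}$$

The probability that an unspecified system of the ensemble falls within any given limits of energy ($\epsilon'$ and $\epsilon''$) is represented by

$$\int_{\epsilon'}^{\epsilon''} e^{\eta + \phi}\, d\epsilon.$$

If we expand $\eta$ and $\phi$ in ascending powers of $\epsilon - \epsilon_0$, without going beyond the squares, the probability that the energy falls within the given limits takes the form of the 'law of errors':

$$e^{\eta_0 + \phi_0} \int_{\epsilon'}^{\epsilon''} e^{\left[ \left(\frac{d^2\eta}{d\epsilon^2}\right)_0 + \left(\frac{d^2\phi}{d\epsilon^2}\right)_0 \right] \frac{(\epsilon - \epsilon_0)^2}{2}}\, d\epsilon. \tag{353}$$

This gives

$$\eta_0 + \phi_0 = \tfrac12 \log \left[ -\frac{1}{2\pi} \left(\frac{d^2\eta}{d\epsilon^2}\right)_0 - \frac{1}{2\pi} \left(\frac{d^2\phi}{d\epsilon^2}\right)_0 \right], \tag{354}$$

and

$$\overline{(\epsilon - \epsilon_0)^2} = \left[ -\left(\frac{d^2\eta}{d\epsilon^2}\right)_0 - \left(\frac{d^2\phi}{d\epsilon^2}\right)_0 \right]^{-1}. \tag{355}$$

We shall have a close approximation in general when the quantities equated in (355) are very small, i.e., when

$$-\left(\frac{d^2\eta}{d\epsilon^2}\right)_0 - \left(\frac{d^2\phi}{d\epsilon^2}\right)_0 \tag{356}$$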
Now when n is very great, — dty/de2 is of the same order of magnitude, and the condition that (356) shall be very great does not restrict very much the nature of the function 7). We may obtain other properties pertaining to average values in a canonical ensemble by the method used for the average of d(j>/de. Let u be any function of the energy, either alone or with © and the external coordinates. The average value of u in the ensemble is determined by the equation F=0 ~h

2, and we get as before. It is evident from the same considerations that the second member of (359) will always vanish if n > 2, unless u becomes infinite at one of the limits, in which case a more careful examination of the value of the expression will be necessary. To facilitate the discussion of such cases, it will be convenient to introduce a certain limitation in regard to the nature of the system considered. We have necessarily supposed, in all our treatment of systems canonically distributed, that the system considered was such as to be capable of the canonical distribution with the given value of the modulus. We shall now suppose that the system is such as to be capable of a canonical distribution with any (finite) f modulus. Let us see what cases we exclude by this last limitation. * A more general equation, which is not limited to ensembles canonically distributed, is ____ where 77 denotes, as usual, the index of probability of phase. t The term finite applied to the modulus is intended to exclude the value zero as well as infinity.110 THE FUNCTION <£ AND The impossibility of a canonical distribution occurs when the equation fails to determine a finite value for i/r. Evidently the equation cannot make ^ an infinite positive quantity, the impossibility therefore occurs when the equation makes = — oo . Now we get easily from (191) If the canonical distribution is possible for any values of ®, we can apply this equation so long as the canonical distribution is possible. The equation shows that as © is increased (without becoming infinite) — yfr cannot become infinite unless e simultaneously becomes infinite, and that as © is decreased (without becoming zero) — cannot become infinite unless simultaneously e becomes an infinite negative quantity. 
The corresponding cases in thermodynamics would be bodies which could absorb or give out an infinite amount of heat without passing certain limits of temperature, when no external work is done in the positive or negative sense. Such infinite values present no analytical difficulties, and do not contradict the general laws of mechanics or of thermodynamics, but they are quite foreign to our ordinary experience of nature. In excluding such cases (which are certainly not entirely devoid of interest) we do not exclude any which are analogous to any actual cases in thermodynamics.

We assume then that for any finite value of $\Theta$ the second member of (361) has a finite value.

When this condition is fulfilled, the second member of (359) will vanish for $u = e^{-\phi} V$. For, if we set $\Theta' = 2\Theta$,

$$e^{-\frac{\epsilon}{\Theta}}\, V = e^{-\frac{\epsilon}{\Theta'}}\, e^{-\frac{\epsilon}{\Theta'}} \int_{V=0}^{\epsilon} e^{\phi}\, d\epsilon \le e^{-\frac{\epsilon}{\Theta'}} \int_{V=0}^{\epsilon} e^{-\frac{\epsilon}{\Theta'} + \phi}\, d\epsilon \le e^{-\frac{\epsilon}{\Theta'}}\, e^{-\frac{\psi'}{\Theta'}},$$

where $\psi'$ denotes the value of $\psi$ for the modulus $\Theta'$. Since the last member of this formula vanishes for $\epsilon = \infty$, the less value represented by the first member must also vanish for the same value of $\epsilon$. Therefore the second member of (359), which differs only by a constant factor, vanishes at the upper limit. The case of the lower limit remains to be considered. Now

$$e^{-\frac{\epsilon}{\Theta}}\, V \le \int_{V=0}^{\epsilon} e^{-\frac{\epsilon}{\Theta} + \phi}\, d\epsilon.$$

The second member of this formula evidently vanishes for the value of $\epsilon$ which gives $V = 0$, whether this be finite or negative infinity. Therefore, the second member of (359) vanishes at the lower limit also, and we have

$$1 - \overline{e^{-\phi} V\, \frac{d\phi}{d\epsilon}} - \frac{\overline{e^{-\phi} V}}{\Theta} + \overline{e^{-\phi} V\, \frac{d\phi}{d\epsilon}} = 0,$$

or

$$\overline{e^{-\phi} V} = \Theta. \tag{362}$$

This equation, which is subject to no restriction in regard to the value of $n$, suggests a connection or analogy between the function of the energy of a system which is represented by $e^{-\phi} V$ and the notion of temperature in thermodynamics. We shall return to this subject in Chapter XIV.

If $n > 2$, the second member of (359) may easily be shown to vanish for any of the following values of $u$, viz.: $\phi$, $e^{\phi}$, $\epsilon$, $\epsilon^m$, where $m$ denotes any positive number.
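The two averages that the text identifies with the modulus and its reciprocal can be verified numerically for a simple case. The sketch below is only an illustration under an assumption that is not in the text: it takes a hypothetical system with $V$ proportional to $\epsilon^k$ (with $k > 1$), for which $e^{-\phi} V = \epsilon / k$ and $d\phi/d\epsilon = (k - 1)/\epsilon$, and evaluates the canonical averages by a midpoint-rule quadrature. The function names are hypothetical.

```python
import math


def canonical_average(u, k, theta, steps=200_000):
    """Average of u(eps) with weight eps**(k - 1) * exp(-eps / theta).

    The weight is the canonical density in energy for the assumed form
    V proportional to eps**k; it is normalized numerically.
    """
    upper = theta * (k + 40.0 * math.sqrt(k) + 40.0)  # weight is negligible beyond this
    width = upper / steps
    numerator = 0.0
    denominator = 0.0
    for i in range(steps):
        eps = (i + 0.5) * width
        weight = math.exp((k - 1) * math.log(eps) - eps / theta)
        numerator += u(eps) * weight
        denominator += weight
    return numerator / denominator


def check_temperature_analogues(k, theta):
    """Return the averages of exp(-phi) * V and of d(phi)/d(eps)."""
    avg_ratio = canonical_average(lambda eps: eps / k, k, theta)
    avg_slope = canonical_average(lambda eps: (k - 1) / eps, k, theta)
    return avg_ratio, avg_slope


if __name__ == "__main__":
    theta = 1.5
    avg_ratio, avg_slope = check_temperature_analogues(3, theta)
    print("average of exp(-phi) V:", avg_ratio, "modulus:", theta)
    print("average of dphi/deps:", avg_slope, "reciprocal of modulus:", 1.0 / theta)
```

In this assumed case the first average should reproduce $\Theta$ and the second $1/\Theta$, in agreement with equations (362) and (360).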
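It will also vanish, when $n > 4$, for $u = d\phi/d\epsilon$, and when $n > 2h$ for $u = e^{-\phi}\, d^h V/d\epsilon^h$.

When the second member of (359) vanishes, and $n > 2$, we may write

$$\overline{(u - \bar u)\left(\frac{d\phi}{d\epsilon} - \frac{1}{\Theta}\right)} = \overline{u\, \frac{d\phi}{d\epsilon}} - \frac{\bar u}{\Theta} = -\overline{\frac{du}{d\epsilon}}. \tag{363}$$

We thus obtain the following equations. If $n > 2$,

$$\overline{(\phi - \bar\phi)\left(\frac{d\phi}{d\epsilon} - \frac{1}{\Theta}\right)} = \overline{\phi\, \frac{d\phi}{d\epsilon}} - \frac{\bar\phi}{\Theta} = -\frac{1}{\Theta}, \tag{364}$$

$$\overline{(e^{\phi} - \overline{e^{\phi}})\left(\frac{d\phi}{d\epsilon} - \frac{1}{\Theta}\right)} = \overline{e^{\phi}\, \frac{d\phi}{d\epsilon}} - \frac{\overline{e^{\phi}}}{\Theta} = -\overline{e^{\phi}\, \frac{d\phi}{d\epsilon}}, \tag{365}$$

or

$$2\, \overline{e^{\phi}\, \frac{d\phi}{d\epsilon}} = \frac{\overline{e^{\phi}}}{\Theta}, \tag{366}$$

$$\overline{(\epsilon - \bar\epsilon)\left(\frac{d\phi}{d\epsilon} - \frac{1}{\Theta}\right)} = \overline{\epsilon\, \frac{d\phi}{d\epsilon}} - \frac{\bar\epsilon}{\Theta} = -1,* \tag{367}$$

$$\overline{(\epsilon^m - \overline{\epsilon^m})\left(\frac{d\phi}{d\epsilon} - \frac{1}{\Theta}\right)} = \overline{\epsilon^m\, \frac{d\phi}{d\epsilon}} - \frac{\overline{\epsilon^m}}{\Theta} = -m\, \overline{\epsilon^{m-1}}. \tag{368}$$

If $n > 4$,

$$\overline{\left(\frac{d\phi}{d\epsilon} - \frac{1}{\Theta}\right)^2} = \overline{\left(\frac{d\phi}{d\epsilon}\right)^2} - \frac{1}{\Theta^2} = -\overline{\frac{d^2\phi}{d\epsilon^2}}.\dagger \tag{369}$$

If $n > 2h$,

$$\overline{e^{-\phi}\, \frac{d^{h+1} V}{d\epsilon^{h+1}}} = \frac{1}{\Theta}\, \overline{e^{-\phi}\, \frac{d^h V}{d\epsilon^h}}, \tag{370}$$

whence

$$\overline{e^{-\phi}\, \frac{d^{h+1} V}{d\epsilon^{h+1}}} = \frac{1}{\Theta^h}. \tag{371}$$

Giving $h$ the values 1, 2, 3, etc., we have

$$\overline{\frac{d\phi}{d\epsilon}} = \frac{1}{\Theta}, \quad \text{if } n > 2,$$

$$\overline{\frac{d^2\phi}{d\epsilon^2}} + \overline{\left(\frac{d\phi}{d\epsilon}\right)^2} = \frac{1}{\Theta^2}, \quad \text{if } n > 4,$$

as already obtained. Also

$$\overline{\frac{d^3\phi}{d\epsilon^3}} + 3\, \overline{\frac{d^2\phi}{d\epsilon^2}\, \frac{d\phi}{d\epsilon}} + \overline{\left(\frac{d\phi}{d\epsilon}\right)^3} = \frac{1}{\Theta^3}, \quad \text{if } n > 6. \tag{372}$$

* This equation may also be obtained from equations (252) and (321). Compare also equation (349) which was derived by an approximative method.

† Compare equation (350), obtained by an approximative method.

If $V_q$ is a continuous increasing function of $\epsilon_q$, commencing with $V_q = 0$, the average value in a canonical ensemble of any function of $\epsilon_q$, either alone or with the modulus and the external coordinates, is given by equation (275), which is identical with (357) except that $\epsilon$, $\phi$, and $\psi$ have the suffix $(\ )_q$. The equation may be transformed so as to give an equation identical with (359) except for the suffixes. If we add the same suffixes to equation (361), the finite value of its members will determine the possibility of the canonical distribution.

From these data, it is easy to derive equations similar to (360), (362)-(372), except that the conditions of their validity must be differently stated. The equation

$$\overline{e^{-\phi_q} V_q} = \Theta$$

requires only the condition already mentioned with respect to $V_q$. This equation corresponds to (362), which is subject to no restriction with respect to the value of $n$. We may observe, however, that $V$ will always satisfy a condition similar to that mentioned with respect to $V_q$.

If $V_q$ satisfies the condition mentioned, and $e^{\phi_q}$ a similar condition, i.e., if $e^{\phi_q}$ is a continuous increasing function of $\epsilon_q$, commencing with the value $e^{\phi_q} = 0$, equations will hold similar to those given for the case when $n > 2$, viz., similar to (360), (364)-(368). Especially important is

$$\overline{\frac{d\phi_q}{d\epsilon_q}} = \frac{1}{\Theta}.$$

If $V_q$, $e^{\phi_q}$ (or $dV_q/d\epsilon_q$), $d^2V_q/d\epsilon_q^2$ all satisfy similar conditions, we shall have an equation similar to (369), which was subject to the condition $n > 4$. And if $d^3V_q/d\epsilon_q^3$ also satisfies a similar condition, we shall have an equation similar to (372), for which the condition was $n > 6$.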
Finally, if $V_q$ and $h$ successive differential coefficients satisfy conditions of the kind mentioned, we shall have equations like (370) and (371) for which the condition was $n > 2h$.

These conditions take the place of those given above relating to $n$. In fact, we might give conditions relating to the differential coefficients of $V$, similar to those given relating to the differential coefficients of $V_q$, instead of the conditions relating to $n$, for the validity of equations (360), (363)-(372). This would somewhat extend the application of the equations.

CHAPTER X.

ON A DISTRIBUTION IN PHASE CALLED MICROCANONICAL IN WHICH ALL THE SYSTEMS HAVE THE SAME ENERGY.

An important case of statistical equilibrium is that in which all systems of the ensemble have the same energy. We may arrive at the notion of a distribution which will satisfy the necessary conditions by the following process. We may suppose that an ensemble is distributed with a uniform density-in-phase between two limiting values of the energy, $\epsilon'$ and $\epsilon''$, and with density zero outside of those limits. Such an ensemble is evidently in statistical equilibrium according to the criterion in Chapter IV, since the density-in-phase may be regarded as a function of the energy. By diminishing the difference of $\epsilon'$ and $\epsilon''$, we may diminish the differences of energy in the ensemble. The limit of this process gives us a permanent distribution in which the energy is constant.

We should arrive at the same result, if we should make the density any function of the energy between the limits $\epsilon'$ and $\epsilon''$, and zero outside of those limits.
Thus, the limiting distribution obtained from the part of a canonical ensemble between two limits of energy, when the difference of the limiting energies is indefinitely diminished, is independent of the modulus, being determined entirely by the energy, and is identical with the limiting distribution obtained from a uniform density between limits of energy approaching the same value.

We shall call the limiting distribution at which we arrive by this process microcanonical.

We shall find however, in certain cases, that for certain values of the energy, viz., for those for which $e^{\phi}$ is infinite, this process fails to define a limiting distribution in any such distinct sense as for other values of the energy. The difficulty is not in the process, but in the nature of the case, being entirely analogous to that which we meet when we try to find a canonical distribution in cases when $\psi$ becomes infinite. We have not regarded such cases as affording true examples of the canonical distribution, and we shall not regard the cases in which $e^{\phi}$ is infinite as affording true examples of the microcanonical distribution. We shall in fact find as we go on that in such cases our most important formulae become illusory.

The use of formulae relating to a canonical ensemble which contain $e^{\phi}\, d\epsilon$ instead of $dp_1 \ldots dq_n$, as in the preceding chapters, amounts to the consideration of the ensemble as divided into an infinity of microcanonical elements.

From a certain point of view, the microcanonical distribution may seem more simple than the canonical, and it has perhaps been more studied, and been regarded as more closely related to the fundamental notions of thermodynamics. To this last point we shall return in a subsequent chapter. It is sufficient here to remark that analytically the canonical distribution is much more manageable than the microcanonical.
We may sometimes avoid difficulties which the microcanonical distribution presents by regarding it as the result of the following process, which involves conceptions less simple but more amenable to analytical treatment. We may suppose an ensemble distributed with a density proportional to

$$e^{-\frac{(\epsilon - \epsilon')^2}{\omega^2}},$$

where $\omega$ and $\epsilon'$ are constants, and then diminish indefinitely the value of the constant $\omega$. Here the density is nowhere zero until we come to the limit, but at the limit it is zero for all energies except $\epsilon'$. We thus avoid the analytical complication of discontinuities in the value of the density, which require the use of integrals with inconvenient limits.

In a microcanonical ensemble of systems the energy ($\epsilon$) is constant, but the kinetic energy ($\epsilon_p$) and the potential energy ($\epsilon_q$) vary in the different systems, subject of course to the condition

$$\epsilon_p + \epsilon_q = \epsilon = \text{constant}. \tag{373}$$

Our first inquiries will relate to the division of energy into these two parts, and to the average values of functions of $\epsilon_p$ and $\epsilon_q$.

We shall use the notation $\overline{u}|_{\epsilon}$ to denote an average value in a microcanonical ensemble of energy $\epsilon$. An average value in a canonical ensemble of modulus $\Theta$, which has hitherto been denoted by $\bar u$, we shall in this chapter denote by $\overline{u}|_{\Theta}$, to distinguish more clearly the two kinds of averages.

The extension-in-phase within any limits which can be given in terms of $\epsilon_p$ and $\epsilon_q$ may be expressed in the notations of the preceding chapter by the double integral

$$\iint dV_p\, dV_q$$

taken within those limits.
If an ensemble of systems is distributed within those limits with a uniform density-in-phase, the average value in the ensemble of any function ($u$) of the kinetic and potential energies will be expressed by the quotient of integrals

$$\frac{\iint u\, dV_p\, dV_q}{\iint dV_p\, dV_q}.$$

Since $dV_p = e^{\phi_p}\, d\epsilon_p$, and $d\epsilon_p = d\epsilon$ when $\epsilon_q$ is constant, the expression may be written

$$\frac{\iint u\, e^{\phi_p}\, d\epsilon\, dV_q}{\iint e^{\phi_p}\, d\epsilon\, dV_q}.$$

To get the average value of $u$ in an ensemble distributed microcanonically with the energy $\epsilon$, we must make the integrations cover the extension-in-phase between the energies $\epsilon$ and $\epsilon + d\epsilon$. This gives

$$\overline{u}|_{\epsilon} = \frac{d\epsilon \int_{V_q=0}^{\epsilon_q=\epsilon} u\, e^{\phi_p}\, dV_q}{d\epsilon \int_{V_q=0}^{\epsilon_q=\epsilon} e^{\phi_p}\, dV_q}.$$

But by (299) the value of the integral in the denominator is $e^{\phi}$. We have therefore

$$\overline{u}|_{\epsilon} = e^{-\phi} \int_{V_q=0}^{\epsilon_q=\epsilon} u\, e^{\phi_p}\, dV_q, \tag{374}$$

where $e^{\phi_p}$ and $V_q$ are connected by equation (373), and $u$, if given as function of $\epsilon_p$, or of $\epsilon_p$ and $\epsilon_q$, becomes in virtue of the same equation a function of $\epsilon_q$ alone.

We shall assume that $e^{\phi}$ has a finite value. If $n > 1$, it is evident from equation (305) that $e^{\phi}$ is an increasing function of $\epsilon$, and therefore cannot be infinite for one value of $\epsilon$ without being infinite for all greater values of $\epsilon$, which would make $-\psi$ infinite.* When $n > 1$, therefore, if we assume that $e^{\phi}$ is finite, we only exclude such cases as we found necessary to exclude in the study of the canonical distribution. But when $n = 1$, cases may occur in which the canonical distribution is perfectly applicable, but in which the formulae for the microcanonical distribution become illusory, for particular values of $\epsilon$, on account of the infinite value of $e^{\phi}$. Such failing cases of the microcanonical distribution for particular values of the energy will not prevent us from regarding the canonical ensemble as consisting of an infinity of microcanonical ensembles.†

* See equation (322).

† An example of the failing case of the microcanonical distribution is afforded by a material point, under the influence of gravity, and constrained to remain in a vertical circle.
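The failing case occurs when the energy is just sufficient to carry the material point to the highest point of the circle. It will be observed that the difficulty is inherent in the nature of the case, and is quite independent of the mathematical formulae. The nature of the difficulty is at once apparent if we try to distribute a finite number of …

From the last equation, with (298), we get

$$\overline{e^{-\phi_p} V_p}|_{\epsilon} = e^{-\phi} \int_{V_q=0}^{\epsilon_q=\epsilon} V_p\, dV_q = e^{-\phi}\, V. \tag{375}$$

But by equations (288) and (289)

$$e^{-\phi_p}\, V_p = \frac{2}{n}\, \epsilon_p. \tag{376}$$

Therefore

$$e^{-\phi}\, V = \overline{e^{-\phi_p} V_p}|_{\epsilon} = \frac{2}{n}\, \overline{\epsilon_p}|_{\epsilon}. \tag{377}$$

Again, with the aid of equation (301), we get

$$\overline{\frac{d\phi_p}{d\epsilon_p}}\bigg|_{\epsilon} = e^{-\phi} \int_{V_q=0}^{\epsilon_q=\epsilon} \frac{d\phi_p}{d\epsilon_p}\, e^{\phi_p}\, dV_q = \frac{d\phi}{d\epsilon}, \tag{378}$$

if $n > 2$. Therefore, by (289),

$$\frac{d\phi}{d\epsilon} = \overline{\frac{d\phi_p}{d\epsilon_p}}\bigg|_{\epsilon} = \left(\frac{n}{2} - 1\right) \overline{\epsilon_p^{-1}}|_{\epsilon}, \quad \text{if } n > 2. \tag{379}$$

These results are interesting on account of the relations of the functions $e^{-\phi} V$ and $d\phi/d\epsilon$ to the notion of temperature in thermodynamics, a subject to which we shall return hereafter. They are particular cases of a general relation easily deduced from equations (306), (374), (288) and (289). We have

$$\frac{d^h V}{d\epsilon^h} = \int_{V_q=0}^{\epsilon_q=\epsilon} \frac{d^h V_p}{d\epsilon_p^h}\, dV_q, \quad \text{if } h < \tfrac12 n + 1.$$

…

$$\overline{e^{-\phi}\, \frac{d^h V}{d\epsilon^h}}\bigg|_{\Theta} = \overline{e^{-\phi_p}\, \frac{d^h V_p}{d\epsilon_p^h}}\bigg|_{\Theta} = \frac{\Gamma(\tfrac12 n)}{\Gamma(\tfrac12 n - h + 1)}\, \overline{\epsilon_p^{\,1-h}}|_{\Theta}, \tag{382}$$

if $h < \tfrac12 n + 1$.* The equations

$$\Theta = \overline{e^{-\phi} V}|_{\Theta} = \overline{e^{-\phi_p} V_p}|_{\Theta} = \frac{2}{n}\, \overline{\epsilon_p}|_{\Theta}, \tag{383}$$

$$\frac{1}{\Theta} = \overline{\frac{d\phi}{d\epsilon}}\bigg|_{\Theta} = \overline{\frac{d\phi_p}{d\epsilon_p}}\bigg|_{\Theta} = \left(\frac{n}{2} - 1\right) \overline{\epsilon_p^{-1}}|_{\Theta} \tag{384}$$

may be regarded as particular cases of the general equation. The last equation is subject to the condition that $n > 2$.

* See equation (292).

The last two equations give for a canonical ensemble, if $n > 2$,

$$\left(1 - \frac{2}{n}\right) \overline{\epsilon_p}|_{\Theta}\; \overline{\epsilon_p^{-1}}|_{\Theta} = 1. \tag{385}$$

The corresponding equations for a microcanonical ensemble give, if $n > 2$,

$$\left(1 - \frac{2}{n}\right) \overline{\epsilon_p}|_{\epsilon}\; \overline{\epsilon_p^{-1}}|_{\epsilon} = \frac{d\phi}{d \log V}, \tag{386}$$

which shows that $d\phi / d\log V$ approaches the value unity when $n$ is very great.

If a system consists of two parts, having separate energies, we may obtain equations similar in form to the preceding, which relate to the system as thus divided.* We shall distinguish quantities relating to the parts by letters with suffixes, the same letters without suffixes relating to the whole system.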
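The microcanonical relation between $e^{-\phi} V$ and the average kinetic energy can be illustrated by direct quadrature. The sketch below is an illustration only. It uses the general values of $V_p$ and $e^{\phi_p}$ from equations (288) and (289), but it assumes a hypothetical configuration measure $V_q = \epsilon_q^{n/2}$, which is not given in the text and is chosen only because it makes the kinetic and potential parts enter symmetrically. The function names are hypothetical.

```python
import math


def v_p(eps_p, n):
    """Extension-in-velocity below kinetic energy eps_p, equation (288)."""
    return (2.0 * math.pi * eps_p) ** (n / 2) / math.gamma(n / 2 + 1)


def exp_phi_p(eps_p, n):
    """Derivative of V_p with respect to eps_p, equation (289)."""
    return (2.0 * math.pi) ** (n / 2) * eps_p ** (n / 2 - 1) / math.gamma(n / 2)


def microcanonical_check(eps, n, steps=20_000):
    """Return (exp(-phi) * V, average kinetic energy) at total energy eps.

    Assumes the hypothetical measure V_q = eps_q ** (n / 2), so that
    dV_q = (n / 2) * eps_q ** (n / 2 - 1) * d(eps_q). Integrals use the midpoint rule.
    """
    width = eps / steps
    big_v = 0.0
    exp_phi = 0.0
    kinetic = 0.0
    for i in range(steps):
        eps_q = (i + 0.5) * width
        eps_p = eps - eps_q
        d_vq = (n / 2) * eps_q ** (n / 2 - 1) * width
        big_v += v_p(eps_p, n) * d_vq                   # equation (298)
        exp_phi += exp_phi_p(eps_p, n) * d_vq           # equation (299)
        kinetic += eps_p * exp_phi_p(eps_p, n) * d_vq   # integral in (374) with u = eps_p
    return big_v / exp_phi, kinetic / exp_phi


if __name__ == "__main__":
    n, eps = 4, 2.0
    ratio, mean_kinetic = microcanonical_check(eps, n)
    print("exp(-phi) V:", ratio)
    print("(2/n) * average kinetic energy:", 2.0 / n * mean_kinetic)
    print("average kinetic energy:", mean_kinetic, "half the total energy:", eps / 2)
```

With the assumed $V_q$, the first two printed values should agree, as equation (377) requires, and the average kinetic energy should be half the total energy because of the symmetry of the assumed measure.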
The extension-in-phase of the whole system within any given limits of the energies may be represented by the double integral
$$\iint dV_1\,dV_2$$
taken within those limits, as appears at once from the definitions of Chapter VIII. In an ensemble distributed with uniform density within those limits, and zero density outside, the average value of any function of $\epsilon_1$ and $\epsilon_2$ is given by the quotient
$$\frac{\iint u\,dV_1\,dV_2}{\iint dV_1\,dV_2},$$
which may also be written†
$$\frac{\iint u\,e^{\phi_1}\,d\epsilon\,dV_2}{\iint e^{\phi_1}\,d\epsilon\,dV_2}.$$
If we make the limits of integration $\epsilon$ and $\epsilon + d\epsilon$, we get the average value of $u$ in an ensemble in which the whole system is microcanonically distributed in phase, viz.,
$$\overline{u}\,|_{\epsilon} = e^{-\phi}\int_{V_2=0}^{\epsilon_2=\epsilon} u\,e^{\phi_1}\,dV_2, \tag{387}$$
where $\phi_1$ and $V_2$ are connected by the equation
$$\epsilon_1 + \epsilon_2 = \text{constant} = \epsilon, \tag{388}$$
and $u$, if given as function of $\epsilon_1$, or of $\epsilon_1$ and $\epsilon_2$, becomes in virtue of the same equation a function of $\epsilon_2$ alone.*

* (To the condition of separate energies.) If this condition is rigorously fulfilled, the parts will have no influence on each other, and the ensemble formed by distributing the whole microcanonically is too arbitrary a conception to have a real interest. The principal interest of the equations which we shall obtain will be in cases in which the condition is approximately fulfilled. But for the purposes of a theoretical discussion, it is of course convenient to make such a condition absolute. Compare Chapter IV, pp. 35 ff., where a similar condition is considered in connection with canonical ensembles.

† Where the analytical transformations are identical in form with those on the preceding pages, it does not appear necessary to give all the steps with the same detail.

* (To equation (387).) In the applications of the equation (387), we cannot obtain all the results corresponding to those which we have obtained from equation (374), because $\phi_p$ is a known function of $\epsilon_p$, while $\phi_1$ must be treated as an arbitrary function of $\epsilon_1$, or nearly so.

Thus
$$\overline{e^{-\phi_1}V_1}\,|_{\epsilon} = e^{-\phi}\int_{V_2=0}^{\epsilon_2=\epsilon} V_1\,dV_2, \tag{389}$$
$$e^{-\phi}V = \overline{e^{-\phi_1}V_1}\,|_{\epsilon} = \overline{e^{-\phi_2}V_2}\,|_{\epsilon}. \tag{390}$$
This requires a similar relation for canonical averages
$$\Theta = \overline{e^{-\phi}V}\,|_{\Theta} = \overline{e^{-\phi_1}V_1}\,|_{\Theta} = \overline{e^{-\phi_2}V_2}\,|_{\Theta}. \tag{391}$$
Again
$$\overline{\frac{d\phi_1}{d\epsilon_1}}\,\Big|_{\epsilon} = e^{-\phi}\int_{V_2=0}^{\epsilon_2=\epsilon} \frac{d\phi_1}{d\epsilon_1}\,e^{\phi_1}\,dV_2. \tag{392}$$
But if $n_1 > 2$, $e^{\phi_1}$ vanishes for $V_1 = 0$,† and
$$\frac{d}{d\epsilon}\,e^{\phi} = \frac{d}{d\epsilon}\int_{V_2=0}^{\epsilon_2=\epsilon} e^{\phi_1}\,dV_2 = \int_{V_2=0}^{\epsilon_2=\epsilon} \frac{d\phi_1}{d\epsilon_1}\,e^{\phi_1}\,dV_2. \tag{393}$$
Hence, if $n_1 > 2$, and $n_2 > 2$,
$$\frac{d\phi}{d\epsilon} = \overline{\frac{d\phi_1}{d\epsilon_1}}\,\Big|_{\epsilon} = \overline{\frac{d\phi_2}{d\epsilon_2}}\,\Big|_{\epsilon}, \tag{394}$$
and
$$\frac{1}{\Theta} = \overline{\frac{d\phi}{d\epsilon}}\,\Big|_{\Theta} = \overline{\frac{d\phi_1}{d\epsilon_1}}\,\Big|_{\Theta} = \overline{\frac{d\phi_2}{d\epsilon_2}}\,\Big|_{\Theta}. \tag{395}$$

† See Chapter VIII, equations (305) and (316).

We have compared certain functions of the energy of the whole system with average values of similar functions of the kinetic energy of the whole system, and with average values of similar functions of the whole energy of a part of the system. We may also compare the same functions with average values of the kinetic energy of a part of the system.

We shall express the total, kinetic, and potential energies of the whole system by $\epsilon$, $\epsilon_p$, and $\epsilon_q$, and the kinetic energies of the parts by $\epsilon_{1p}$ and $\epsilon_{2p}$. These kinetic energies are necessarily separate: we need not make any supposition concerning potential energies. The extension-in-phase within any limits which can be expressed in terms of $\epsilon_q$, $\epsilon_{1p}$, $\epsilon_{2p}$ may be represented in the notations of Chapter VIII by the triple integral
$$\iiint dV_{1p}\,dV_{2p}\,dV_q$$
taken within those limits. And if an ensemble of systems is distributed with a uniform density within those limits, the average value of any function of $\epsilon_q$, $\epsilon_{1p}$, $\epsilon_{2p}$ will be expressed by the quotient
$$\frac{\iiint u\,dV_{1p}\,dV_{2p}\,dV_q}{\iiint dV_{1p}\,dV_{2p}\,dV_q} \quad\text{or}\quad \frac{\iiint u\,e^{\phi_{1p}}\,d\epsilon\,dV_{2p}\,dV_q}{\iiint e^{\phi_{1p}}\,d\epsilon\,dV_{2p}\,dV_q}.$$
To get the average value of $u$ for a microcanonical distribution, we must make the limits $\epsilon$ and $\epsilon + d\epsilon$. The denominator in this case becomes $e^{\phi}\,d\epsilon$, and we have
$$\overline{u}\,|_{\epsilon} = e^{-\phi}\int_{V_q=0}^{\epsilon_q=\epsilon}\int_{V_{2p}=0}^{\epsilon_{2p}=\epsilon-\epsilon_q} u\,e^{\phi_{1p}}\,dV_{2p}\,dV_q, \tag{396}$$
where $\phi_{1p}$, $V_{2p}$, and $V_q$ are connected by the equation $\epsilon_{1p} + \epsilon_{2p} + \epsilon_q = \text{constant} = \epsilon$.
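Accordingly
$$\overline{e^{-\phi_{1p}}V_{1p}}\,|_{\epsilon} = e^{-\phi}\int_{V_q=0}^{\epsilon_q=\epsilon}\int_{V_{2p}=0}^{\epsilon_{2p}=\epsilon-\epsilon_q} V_{1p}\,dV_{2p}\,dV_q = e^{-\phi}V, \tag{397}$$
and we may write
$$e^{-\phi}V = \overline{e^{-\phi_{1p}}V_{1p}}\,|_{\epsilon} = \overline{e^{-\phi_{2p}}V_{2p}}\,|_{\epsilon} = \frac{2}{n_1}\,\overline{\epsilon_{1p}}\,|_{\epsilon} = \frac{2}{n_2}\,\overline{\epsilon_{2p}}\,|_{\epsilon}, \tag{398}$$
and
$$\Theta = \overline{e^{-\phi}V}\,|_{\Theta} = \overline{e^{-\phi_{1p}}V_{1p}}\,|_{\Theta} = \overline{e^{-\phi_{2p}}V_{2p}}\,|_{\Theta} = \frac{2}{n_1}\,\overline{\epsilon_{1p}}\,|_{\Theta} = \frac{2}{n_2}\,\overline{\epsilon_{2p}}\,|_{\Theta}. \tag{399}$$
Again, if $n_1 > 2$,
$$\overline{\frac{d\phi_{1p}}{d\epsilon_{1p}}}\,\Big|_{\epsilon} = e^{-\phi}\int_{V_q=0}^{\epsilon_q=\epsilon}\int_{V_{2p}=0}^{\epsilon_{2p}=\epsilon-\epsilon_q} \frac{d\phi_{1p}}{d\epsilon_{1p}}\,e^{\phi_{1p}}\,dV_{2p}\,dV_q = e^{-\phi}\int_{V_q=0}^{\epsilon_q=\epsilon} \frac{d\phi_p}{d\epsilon_p}\,e^{\phi_p}\,dV_q = \frac{d\phi}{d\epsilon}. \tag{400}$$
Hence, if $n_1 > 2$, and $n_2 > 2$,
$$\frac{d\phi}{d\epsilon} = \overline{\frac{d\phi_{1p}}{d\epsilon_{1p}}}\,\Big|_{\epsilon} = \overline{\frac{d\phi_{2p}}{d\epsilon_{2p}}}\,\Big|_{\epsilon} = \left(\tfrac{1}{2}n_1 - 1\right)\overline{\epsilon_{1p}^{-1}}\,|_{\epsilon} = \left(\tfrac{1}{2}n_2 - 1\right)\overline{\epsilon_{2p}^{-1}}\,|_{\epsilon}, \tag{401}$$
$$\frac{1}{\Theta} = \overline{\frac{d\phi}{d\epsilon}}\,\Big|_{\Theta} = \overline{\frac{d\phi_{1p}}{d\epsilon_{1p}}}\,\Big|_{\Theta} = \overline{\frac{d\phi_{2p}}{d\epsilon_{2p}}}\,\Big|_{\Theta} = \left(\tfrac{1}{2}n_1 - 1\right)\overline{\epsilon_{1p}^{-1}}\,|_{\Theta} = \left(\tfrac{1}{2}n_2 - 1\right)\overline{\epsilon_{2p}^{-1}}\,|_{\Theta}. \tag{402}$$

We cannot apply the methods employed in the preceding pages to the microcanonical averages of the (generalized) forces $A_1$, $A_2$, etc., exerted by a system on external bodies, since these quantities are not functions of the energies, either kinetic or potential, of the whole or any part of the system. We may however use the method described on page 116.

Let us imagine an ensemble of systems distributed in phase according to the index of probability
$$c - \frac{(\epsilon - \epsilon')^2}{\omega^2},$$
where $\epsilon'$ is any constant which is a possible value of the energy, except only the least value which is consistent with the values of the external coordinates, and $c$ and $\omega$ are other constants. We have therefore
$$\int\!\cdots\!\int_{\text{all phases}} e^{\,c - \frac{(\epsilon-\epsilon')^2}{\omega^2}}\,dp_1\ldots dq_n = 1, \tag{403}$$
or
$$e^{-c} = \int\!\cdots\!\int_{\text{all phases}} e^{-\frac{(\epsilon-\epsilon')^2}{\omega^2}}\,dp_1\ldots dq_n, \tag{404}$$
or again
$$e^{-c} = \int_{V=0}^{\epsilon=\infty} e^{-\frac{(\epsilon-\epsilon')^2}{\omega^2}+\phi}\,d\epsilon. \tag{405}$$
From (404) we have
$$\frac{de^{-c}}{da_1} = \int\!\cdots\!\int_{\text{all phases}} \frac{2(\epsilon-\epsilon')}{\omega^2}\,A_1\,e^{-\frac{(\epsilon-\epsilon')^2}{\omega^2}}\,dp_1\ldots dq_n = \int_{V=0}^{\epsilon=\infty} \frac{2(\epsilon-\epsilon')}{\omega^2}\,\overline{A_1}\,|_{\epsilon}\;e^{-\frac{(\epsilon-\epsilon')^2}{\omega^2}+\phi}\,d\epsilon, \tag{406}$$
where $\overline{A_1}\,|_{\epsilon}$ denotes the average value of $A_1$ in those systems of the ensemble which have any same energy $\epsilon$. (This is the same thing as the average value of $A_1$ in a microcanonical ensemble of energy $\epsilon$.) The validity of the transformation is evident, if we consider separately the part of each integral which lies between two infinitesimally differing limits of energy. Integrating by parts, we get
$$\frac{de^{-c}}{da_1} = \left[-\overline{A_1}\,|_{\epsilon}\;e^{-\frac{(\epsilon-\epsilon')^2}{\omega^2}+\phi}\right]_{V=0}^{\epsilon=\infty} + \int_{V=0}^{\epsilon=\infty}\left(\frac{d\,\overline{A_1}\,|_{\epsilon}}{d\epsilon} + \overline{A_1}\,|_{\epsilon}\,\frac{d\phi}{d\epsilon}\right)e^{-\frac{(\epsilon-\epsilon')^2}{\omega^2}+\phi}\,d\epsilon. \tag{407}$$
Differentiating (405), we get
$$\frac{de^{-c}}{da_1} = \int_{V=0}^{\epsilon=\infty} \frac{d\phi}{da_1}\,e^{-\frac{(\epsilon-\epsilon')^2}{\omega^2}+\phi}\,d\epsilon - \left(e^{-\frac{(\epsilon-\epsilon')^2}{\omega^2}+\phi}\,\frac{d\epsilon_\alpha}{da_1}\right)_{V=0}, \tag{408}$$
where $\epsilon_\alpha$ denotes the least value of $\epsilon$ consistent with the external coordinates. The last term in this equation represents the part of $de^{-c}/da_1$ which is due to the variation of the lower limit of the integral. The expression in the brackets in (407) evidently vanishes at the upper limit. At the lower limit, at which $\epsilon_p = 0$ and $\epsilon_q$ has the least value consistent with the external coordinates, there is but one value of $A_1$, which is represented by $-d\epsilon_\alpha/da_1$. The last term of (408) is therefore equal to the first term of the second member of (407). (We may observe that both vanish when $n > 2$ on account of the factor $e^{\phi}$.)

We have therefore from these equations
$$\int_{V=0}^{\epsilon=\infty}\left(\frac{d\,\overline{A_1}\,|_{\epsilon}}{d\epsilon} + \overline{A_1}\,|_{\epsilon}\,\frac{d\phi}{d\epsilon} - \frac{d\phi}{da_1}\right)e^{-\frac{(\epsilon-\epsilon')^2}{\omega^2}+\phi}\,d\epsilon = 0. \tag{409}$$
That is: the average value in the ensemble of the quantity represented by the principal parenthesis is zero. This must be true for any value of $\omega$. If we diminish $\omega$, the average value of the parenthesis at the limit when $\omega$ vanishes becomes identical with the value for $\epsilon = \epsilon'$. But this may be any value of the energy, except the least possible. We have therefore
$$\frac{d\,\overline{A_1}\,|_{\epsilon}}{d\epsilon} + \overline{A_1}\,|_{\epsilon}\,\frac{d\phi}{d\epsilon} - \frac{d\phi}{da_1} = 0, \tag{410}$$
unless it be for the least value of the energy consistent with the external coordinates, or for particular values of the external coordinates. But the value of any term of this equation as determined for particular values of the energy and of the external coordinates is not distinguishable from its value as determined for values of the energy and external coordinates indefinitely near those particular values. The equation therefore holds without limitation. Multiplying by $e^{\phi}$, we get
$$e^{\phi}\,\frac{d\,\overline{A_1}\,|_{\epsilon}}{d\epsilon} + \overline{A_1}\,|_{\epsilon}\,e^{\phi}\,\frac{d\phi}{d\epsilon} = e^{\phi}\,\frac{d\phi}{da_1} = \frac{de^{\phi}}{da_1} = \frac{d^2V}{da_1\,d\epsilon}. \tag{411}$$
The integral of this equation is
$$\overline{A_1}\,|_{\epsilon}\;e^{\phi} = \frac{dV}{da_1} + F_1, \tag{412}$$
where $F_1$ is a function of the external coordinates. We have an equation of this form for each of the external coordinates. This gives, with (266), for the complete value of the differential of $V$
$$dV = e^{\phi}\,d\epsilon + \left(e^{\phi}\,\overline{A_1}\,|_{\epsilon} - F_1\right)da_1 + \left(e^{\phi}\,\overline{A_2}\,|_{\epsilon} - F_2\right)da_2 + \text{etc.}, \tag{413}$$
or
$$dV = e^{\phi}\left(d\epsilon + \overline{A_1}\,|_{\epsilon}\,da_1 + \overline{A_2}\,|_{\epsilon}\,da_2 + \text{etc.}\right) - F_1\,da_1 - F_2\,da_2 - \text{etc.} \tag{414}$$
To determine the values of the functions $F_1$, $F_2$, etc., let us suppose $a_1$, $a_2$, etc. to vary arbitrarily, while $\epsilon$ varies so as always to have the least value consistent with the values of the external coordinates. This will make $V = 0$, and $dV = 0$. If $n > 2$, we shall have also $e^{\phi} = 0$, which will give
$$F_1 = 0, \quad F_2 = 0, \quad \text{etc.} \tag{415}$$
The result is the same for any value of $n$. For in the variations considered the kinetic energy will be constantly zero, and the potential energy will have the least value consistent with the external coordinates.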
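The condition of the least possible potential energy may limit the ensemble at each instant to a single configuration, or it may not do so; but in any case the values of $A_1$, $A_2$, etc. will be the same at each instant for all the systems of the ensemble,* and the equation
$$d\epsilon + A_1\,da_1 + A_2\,da_2 + \text{etc.} = 0$$
will hold for the variations considered. Hence the functions $F_1$, $F_2$, etc. vanish in any case, and we have the equation
$$dV = e^{\phi}\,d\epsilon + e^{\phi}\,\overline{A_1}\,|_{\epsilon}\,da_1 + e^{\phi}\,\overline{A_2}\,|_{\epsilon}\,da_2 + \text{etc.}, \tag{416}$$
or
$$d\log V = \frac{d\epsilon + \overline{A_1}\,|_{\epsilon}\,da_1 + \overline{A_2}\,|_{\epsilon}\,da_2 + \text{etc.}}{e^{-\phi}V}, \tag{417}$$
or again
$$d\epsilon = e^{-\phi}V\,d\log V - \overline{A_1}\,|_{\epsilon}\,da_1 - \overline{A_2}\,|_{\epsilon}\,da_2 - \text{etc.} \tag{418}$$
It will be observed that the two last equations have the form of the fundamental differential equations of thermodynamics, $e^{-\phi}V$ corresponding to temperature and $\log V$ to entropy. We have already observed properties of $e^{-\phi}V$ suggestive of an analogy with temperature.† The significance of these facts will be discussed in another chapter.

The two last equations might be written more simply
$$dV = \frac{d\epsilon + \overline{A_1}\,|_{\epsilon}\,da_1 + \overline{A_2}\,|_{\epsilon}\,da_2 + \text{etc.}}{e^{-\phi}}, \qquad d\epsilon = e^{-\phi}\,dV - \overline{A_1}\,|_{\epsilon}\,da_1 - \overline{A_2}\,|_{\epsilon}\,da_2 - \text{etc.}$$

CHAPTER XI.
MAXIMUM AND MINIMUM PROPERTIES OF VARIOUS DISTRIBUTIONS IN PHASE.

In the following theorems we suppose, as always, that the systems forming an ensemble are identical in nature and in the values of the external coordinates, which are here regarded as constants.

Theorem I. If an ensemble of systems is so distributed in phase that the index of probability is a function of the energy, the average value of the index is less than for any other distribution in which the distribution in energy is unaltered.

Let us write $\eta$ for the index which is a function of the energy, and $\eta + \Delta\eta$ for any other which gives the same distribution in energy. It is to be proved that
$$\int\!\cdots\!\int_{\text{all phases}} (\eta + \Delta\eta)\,e^{\eta+\Delta\eta}\,dp_1\ldots dq_n > \int\!\cdots\!\int_{\text{all phases}} \eta\,e^{\eta}\,dp_1\ldots dq_n, \tag{419}$$
where $\eta$ is a function of the energy, and $\Delta\eta$ a function of the phase, which are subject to the conditions that
$$\int\!\cdots\!\int_{\text{all phases}} e^{\eta+\Delta\eta}\,dp_1\ldots dq_n = \int\!\cdots\!\int_{\text{all phases}} e^{\eta}\,dp_1\ldots dq_n = 1, \tag{420}$$
and that for any value of the energy ($\epsilon'$)
$$\int\!\cdots\!\int_{\epsilon=\epsilon'}^{\epsilon=\epsilon'+d\epsilon'} e^{\eta+\Delta\eta}\,dp_1\ldots dq_n = \int\!\cdots\!\int_{\epsilon=\epsilon'}^{\epsilon=\epsilon'+d\epsilon'} e^{\eta}\,dp_1\ldots dq_n. \tag{421}$$
Equation (420) expresses the general relations which $\eta$ and $\eta + \Delta\eta$ must satisfy in order to be indices of any distributions, and (421) expresses the condition that they give the same distribution in energy.

Since $\eta$ is a function of the energy, and may therefore be regarded as a constant within the limits of integration of (421), we may multiply by $\eta$ under the integral sign in both members, which gives
$$\int\!\cdots\!\int_{\epsilon=\epsilon'}^{\epsilon=\epsilon'+d\epsilon'} \eta\,e^{\eta+\Delta\eta}\,dp_1\ldots dq_n = \int\!\cdots\!\int_{\epsilon=\epsilon'}^{\epsilon=\epsilon'+d\epsilon'} \eta\,e^{\eta}\,dp_1\ldots dq_n.$$
Since this is true within the limits indicated, and for every value of $\epsilon'$, it will be true if the integrals are taken for all phases. We may therefore cancel the corresponding parts of (419), which gives
$$\int\!\cdots\!\int_{\text{all phases}} \Delta\eta\,e^{\eta+\Delta\eta}\,dp_1\ldots dq_n > 0. \tag{422}$$
But by (420) this is equivalent to
$$\int\!\cdots\!\int_{\text{all phases}} \left(\Delta\eta\,e^{\Delta\eta} + 1 - e^{\Delta\eta}\right)e^{\eta}\,dp_1\ldots dq_n > 0. \tag{423}$$
Now $\Delta\eta\,e^{\Delta\eta} + 1 - e^{\Delta\eta}$ is a decreasing function of $\Delta\eta$ for negative values of $\Delta\eta$, and an increasing function of $\Delta\eta$ for positive values of $\Delta\eta$. It vanishes for $\Delta\eta = 0$. The expression is therefore incapable of a negative value, and can have the value 0 only for $\Delta\eta = 0$. The inequality (423) will hold therefore unless $\Delta\eta = 0$ for all phases. The theorem is therefore proved.

Theorem II. If an ensemble of systems is canonically distributed in phase, the average index of probability is less than in any other distribution of the ensemble having the same average energy.

For the canonical distribution let the index be $(\psi - \epsilon)/\Theta$, and for another having the same average energy let the index be $(\psi - \epsilon)/\Theta + \Delta\eta$, where $\Delta\eta$ is an arbitrary function of the phase subject only to the limitation involved in the notion of the index, that
$$\int\!\cdots\!\int_{\text{all phases}} e^{\frac{\psi-\epsilon}{\Theta}+\Delta\eta}\,dp_1\ldots dq_n = \int\!\cdots\!\int_{\text{all phases}} e^{\frac{\psi-\epsilon}{\Theta}}\,dp_1\ldots dq_n = 1, \tag{424}$$
and to that relating to the constant average energy, that
$$\int\!\cdots\!\int_{\text{all phases}} \epsilon\,e^{\frac{\psi-\epsilon}{\Theta}+\Delta\eta}\,dp_1\ldots dq_n = \int\!\cdots\!\int_{\text{all phases}} \epsilon\,e^{\frac{\psi-\epsilon}{\Theta}}\,dp_1\ldots dq_n. \tag{425}$$
It is to be proved that
$$\int\!\cdots\!\int_{\text{all phases}} \left(\frac{\psi-\epsilon}{\Theta} + \Delta\eta\right)e^{\frac{\psi-\epsilon}{\Theta}+\Delta\eta}\,dp_1\ldots dq_n > \int\!\cdots\!\int_{\text{all phases}} \frac{\psi-\epsilon}{\Theta}\,e^{\frac{\psi-\epsilon}{\Theta}}\,dp_1\ldots dq_n. \tag{426}$$
Now in virtue of the first condition (424) we may cancel the constant term $\psi/\Theta$ in the parentheses in (426), and in virtue of the second condition (425) we may cancel the term $\epsilon/\Theta$. The proposition to be proved is thus reduced to
$$\int\!\cdots\!\int_{\text{all phases}} \Delta\eta\,e^{\frac{\psi-\epsilon}{\Theta}+\Delta\eta}\,dp_1\ldots dq_n > 0,$$
which may be written, in virtue of the condition (424),
$$\int\!\cdots\!\int_{\text{all phases}} \left(\Delta\eta\,e^{\Delta\eta} + 1 - e^{\Delta\eta}\right)e^{\frac{\psi-\epsilon}{\Theta}}\,dp_1\ldots dq_n > 0. \tag{427}$$
In this form its truth is evident for the same reasons which applied to (423).

Theorem III. If $\Theta$ is any positive constant, the average value in an ensemble of the expression $\eta + \epsilon/\Theta$ ($\eta$ denoting as usual the index of probability and $\epsilon$ the energy) is less when the ensemble is distributed canonically with modulus $\Theta$, than for any other distribution whatever.

In accordance with our usual notation let us write $(\psi - \epsilon)/\Theta$ for the index of the canonical distribution. In any other distribution let the index be $(\psi - \epsilon)/\Theta + \Delta\eta$. In the canonical ensemble $\eta + \epsilon/\Theta$ has the constant value $\psi/\Theta$; in the other ensemble it has the value $\psi/\Theta + \Delta\eta$. The proposition to be proved may therefore be written
$$\frac{\psi}{\Theta} < \int\!\cdots\!\int_{\text{all phases}} \left(\frac{\psi}{\Theta} + \Delta\eta\right)e^{\frac{\psi-\epsilon}{\Theta}+\Delta\eta}\,dp_1\ldots dq_n, \tag{428}$$
where
$$\int\!\cdots\!\int_{\text{all phases}} e^{\frac{\psi-\epsilon}{\Theta}+\Delta\eta}\,dp_1\ldots dq_n = \int\!\cdots\!\int_{\text{all phases}} e^{\frac{\psi-\epsilon}{\Theta}}\,dp_1\ldots dq_n = 1. \tag{429}$$
In virtue of this condition, since $\psi/\Theta$ is constant, the proposition to be proved reduces to
$$0 < \int\!\cdots\!\int_{\text{all phases}} \Delta\eta\,e^{\frac{\psi-\epsilon}{\Theta}+\Delta\eta}\,dp_1\ldots dq_n, \tag{430}$$
where the demonstration may be concluded as in the last theorem.

If we should substitute for the energy in the preceding theorems any other function of the phase, the theorems, mutatis mutandis, would still hold. On account of the unique importance of the energy as a function of the phase, the theorems as given are especially worthy of notice. When the case is such that other functions of the phase have important properties relating to statistical equilibrium, as described in Chapter IV,* the three following theorems, which are generalizations of the preceding, may be useful. It will be sufficient to give them without demonstration, as the principles involved are in no respect different.

* See pages 37-41.

Theorem IV. If an ensemble of systems is so distributed in phase that the index of probability is any function of $F_1$, $F_2$, etc., (these letters denoting functions of the phase,) the average value of the index is less than for any other distribution in phase in which the distribution with respect to the functions $F_1$, $F_2$, etc. is unchanged.

Theorem V.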
If an ensemble of systems is so distributed in phase that the index of probability is a linear function of $F_1$, $F_2$, etc., (these letters denoting functions of the phase,) the average value of the index is less than for any other distribution in which the functions $F_1$, $F_2$, etc. have the same average values.

Theorem VI. The average value in an ensemble of systems of $\eta + F$ (where $\eta$ denotes as usual the index of probability and $F$ any function of the phase) is less when the ensemble is so distributed that $\eta + F$ is constant than for any other distribution whatever.

Theorem VII. If a system which in its different phases constitutes an ensemble consists of two parts, and we consider the average index of probability for the whole system, and also the average indices for each of the parts taken separately, the sum of the average indices for the parts will be either less than the average index for the whole system, or equal to it, but cannot be greater. The limiting case of equality occurs when the distribution in phase of each part is independent of that of the other, and only in this case.

Let the coordinates and momenta of the whole system be $q_1, \ldots q_n$, $p_1, \ldots p_n$, of which $q_1, \ldots q_m$, $p_1, \ldots p_m$ relate to one part of the system, and $q_{m+1}, \ldots q_n$, $p_{m+1}, \ldots p_n$ to the other. If the index of probability for the whole system is denoted by $\eta$, the probability that the phase of an unspecified system lies within any given limits is expressed by the integral
$$\int\!\cdots\!\int e^{\eta}\,dp_1\ldots dq_n \tag{431}$$
taken for those limits. If we set
$$\int\!\cdots\!\int e^{\eta}\,dp_{m+1}\ldots dp_n\,dq_{m+1}\ldots dq_n = e^{\eta_1}, \tag{432}$$
where the integrations cover all phases of the second system, and
$$\int\!\cdots\!\int e^{\eta}\,dp_1\ldots dp_m\,dq_1\ldots dq_m = e^{\eta_2}, \tag{433}$$
where the integrations cover all phases of the first system, the integral (431) will reduce to the form
$$\int\!\cdots\!\int e^{\eta_1}\,dp_1\ldots dp_m\,dq_1\ldots dq_m \tag{434}$$
when the limits can be expressed in terms of the coordinates and momenta of the first part of the system.
The same integral will reduce to
$$\int\!\cdots\!\int e^{\eta_2}\,dp_{m+1}\ldots dp_n\,dq_{m+1}\ldots dq_n \tag{435}$$
when the limits can be expressed in terms of the coordinates and momenta of the second part of the system. It is evident that $\eta_1$ and $\eta_2$ are the indices of probability for the two parts of the system taken separately.

The main proposition to be proved may be written
$$\int\!\cdots\!\int \eta_1\,e^{\eta_1}\,dp_1\ldots dq_m + \int\!\cdots\!\int \eta_2\,e^{\eta_2}\,dp_{m+1}\ldots dq_n \le \int\!\cdots\!\int \eta\,e^{\eta}\,dp_1\ldots dq_n, \tag{436}$$
where the first integral is to be taken over all phases of the first part of the system, the second integral over all phases of the second part of the system, and the last integral over all phases of the whole system. Now we have
$$\int\!\cdots\!\int e^{\eta}\,dp_1\ldots dq_n = 1, \tag{437}$$
$$\int\!\cdots\!\int e^{\eta_1}\,dp_1\ldots dq_m = 1, \tag{438}$$
and
$$\int\!\cdots\!\int e^{\eta_2}\,dp_{m+1}\ldots dq_n = 1, \tag{439}$$
where the limits cover in each case all the phases to which the variables relate. The two last equations, which are in themselves evident, may be derived by partial integration from the first.

It appears from the definitions of $\eta_1$ and $\eta_2$ that (436) may also be written
$$\int\!\cdots\!\int \eta_1\,e^{\eta}\,dp_1\ldots dq_n + \int\!\cdots\!\int \eta_2\,e^{\eta}\,dp_1\ldots dq_n \le \int\!\cdots\!\int \eta\,e^{\eta}\,dp_1\ldots dq_n, \tag{440}$$
or
$$\int\!\cdots\!\int (\eta - \eta_1 - \eta_2)\,e^{\eta}\,dp_1\ldots dq_n \ge 0,$$
where the integrations cover all phases. Adding the equation
$$\int\!\cdots\!\int e^{\eta_1+\eta_2}\,dp_1\ldots dq_n = 1, \tag{441}$$
which we get by multiplying (438) and (439), and subtracting (437), we have for the proposition to be proved
$$\int\!\cdots\!\int_{\text{all phases}} \left[(\eta - \eta_1 - \eta_2)\,e^{\eta} + e^{\eta_1+\eta_2} - e^{\eta}\right]dp_1\ldots dq_n \ge 0. \tag{442}$$
Let
$$u = \eta - \eta_1 - \eta_2. \tag{443}$$
The main proposition to be proved may be written
$$\int\!\cdots\!\int_{\text{all phases}} \left(u\,e^{u} + 1 - e^{u}\right)e^{\eta_1+\eta_2}\,dp_1\ldots dq_n \ge 0. \tag{444}$$
This is evidently true since the quantity in the parenthesis is incapable of a negative value.* Moreover the sign = can hold only when the quantity in the parenthesis vanishes for all phases, i.e., when $u = 0$ for all phases. This makes $\eta = \eta_1 + \eta_2$ for all phases, which is the analytical condition which expresses that the distributions in phase of the two parts of the system are independent.
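Theorems II and VII can be illustrated on a discrete analogue, in which the integrals over phase are replaced by sums over a few cells. This replacement is an assumption made only for illustration and is not part of the text; the function names below are likewise hypothetical. The sketch compares the average index $\sum P \log P$ of a canonical distribution with that of a perturbed distribution having the same average energy, and compares the sum of the average indices of two parts with the average index of the whole.

```python
import math


def average_index(probs):
    """Discrete analogue of the average index: sum of P log P (empty cells skipped)."""
    return sum(p * math.log(p) for p in probs if p > 0)


def canonical(energies, theta):
    """Probabilities proportional to exp(-energy/theta), normalized to unity."""
    weights = [math.exp(-e / theta) for e in energies]
    total = sum(weights)
    return [w / total for w in weights]


def perturb_same_mean_energy(probs, t):
    """Shift a three-cell distribution on energies 0, 1, 2 by t*(1, -2, 1).

    The shift sums to zero and has zero first moment in the energy, so both
    the normalization and the average energy are left unchanged.
    """
    return [probs[0] + t, probs[1] - 2 * t, probs[2] + t]


def marginals(joint):
    """Distributions of the two parts, obtained by summing over the other part."""
    first = [sum(row) for row in joint]
    second = [sum(col) for col in zip(*joint)]
    return first, second


def flatten(joint):
    return [p for row in joint for p in row]


if __name__ == "__main__":
    # Theorem II: the canonical distribution has the least average index.
    p = canonical([0.0, 1.0, 2.0], theta=1.0)
    for t in (-0.02, 0.02):
        print(average_index(p), "<", average_index(perturb_same_mean_energy(p, t)))

    # Theorem VII: sum of the average indices of the parts vs. that of the whole.
    joint = [[0.4, 0.1], [0.1, 0.4]]          # the two parts are correlated
    first, second = marginals(joint)
    print(average_index(first) + average_index(second), "<=",
          average_index(flatten(joint)))
```

When the joint table is the product of its marginals, the two sides of the last comparison agree to rounding error, which corresponds to the limiting case of independence in Theorem VII.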
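* See Theorem I, where this is proved of a similar expression.

Theorem VIII. If two or more ensembles of systems which are identical in nature, but may be distributed differently in phase, are united to form a single ensemble, so that the probability-coefficient of the resulting ensemble is a linear function of the probability-coefficients of the original ensembles, the average index of probability of the resulting ensemble cannot be greater than the same linear function of the average indices of the original ensembles. It can be equal to it only when the original ensembles are similarly distributed in phase.

Let $P_1$, $P_2$, etc. be the probability-coefficients of the original ensembles, and $P$ that of the ensemble formed by combining them; and let $N_1$, $N_2$, etc. be the numbers of systems in the original ensembles. It is evident that we shall have
$$P = c_1P_1 + c_2P_2 + \text{etc.} = \Sigma(c_1P_1), \tag{445}$$
where
$$c_1 = \frac{N_1}{\Sigma N_1}, \quad c_2 = \frac{N_2}{\Sigma N_1}, \quad \text{etc.} \tag{446}$$
The main proposition to be proved is that
$$\int\!\cdots\!\int_{\text{all phases}} P\log P\,dp_1\ldots dq_n \le \Sigma\left[c_1\int\!\cdots\!\int_{\text{all phases}} P_1\log P_1\,dp_1\ldots dq_n\right] \tag{447}$$
or
$$\int\!\cdots\!\int_{\text{all phases}} \left[\Sigma(c_1P_1\log P_1) - P\log P\right]dp_1\ldots dq_n \ge 0. \tag{448}$$
If we set
$$Q_1 = P_1\log P_1 - P_1\log P - P_1 + P,$$
$Q_1$ will be positive, except when it vanishes for $P_1 = P$. To prove this, we may regard $P_1$ and $P$ as any positive quantities. Then
$$\left(\frac{dQ_1}{dP_1}\right)_P = \log P_1 - \log P, \qquad \left(\frac{d^2Q_1}{dP_1^2}\right)_P = \frac{1}{P_1}.$$
Since $Q_1$ and $dQ_1/dP_1$ vanish for $P_1 = P$, and the second differential coefficient is always positive, $Q_1$ must be positive except when $P_1 = P$. Therefore, if $Q_2$, etc. have similar definitions,
$$\Sigma(c_1Q_1) \ge 0. \tag{449}$$
But since $\Sigma(c_1P_1) = P$ and $\Sigma c_1 = 1$,
$$\Sigma(c_1Q_1) = \Sigma(c_1P_1\log P_1) - P\log P. \tag{450}$$
This proves (448), and shows that the sign = will hold only when $P_1 = P$, $P_2 = P$, etc., for all phases.

Theorem IX. A uniform distribution of a given number of systems within given limits of phase gives a less average index of probability of phase than any other distribution.

Let $\eta$ be the constant index of the uniform distribution, and $\eta + \Delta\eta$ the index of some other distribution. Since the number of systems within the given limits is the same in the two distributions we have
$$\int\!\cdots\!\int e^{\eta+\Delta\eta}\,dp_1\ldots dq_n = \int\!\cdots\!\int e^{\eta}\,dp_1\ldots dq_n, \tag{451}$$
where the integrations, like those which follow, are to be taken within the given limits. The proposition to be proved may be written
$$\int\!\cdots\!\int (\eta + \Delta\eta)\,e^{\eta+\Delta\eta}\,dp_1\ldots dq_n > \int\!\cdots\!\int \eta\,e^{\eta}\,dp_1\ldots dq_n, \tag{452}$$
or, since $\eta$ is constant,
$$\int\!\cdots\!\int (\eta + \Delta\eta)\,e^{\Delta\eta}\,dp_1\ldots dq_n > \int\!\cdots\!\int \eta\,dp_1\ldots dq_n. \tag{453}$$
In (451) also we may cancel the constant factor $e^{\eta}$, and multiply by the constant factor $(\eta + 1)$. This gives
$$\int\!\cdots\!\int (\eta + 1)\,e^{\Delta\eta}\,dp_1\ldots dq_n = \int\!\cdots\!\int (\eta + 1)\,dp_1\ldots dq_n.$$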
The subtraction of this equation will not alter the inequality to be proved, which may therefore be written
$$\int\!\cdots\!\int (\Delta\eta - 1)\,e^{\Delta\eta}\,dp_1\ldots dq_n > \int\!\cdots\!\int -\,dp_1\ldots dq_n$$
or
$$\int\!\cdots\!\int \left(\Delta\eta\,e^{\Delta\eta} - e^{\Delta\eta} + 1\right)dp_1\ldots dq_n > 0. \tag{454}$$
Since the parenthesis in this expression represents a positive value, except when it vanishes for $\Delta\eta = 0$, the integral will be positive unless $\Delta\eta$ vanishes everywhere within the limits, which would make the difference of the two distributions vanish. The theorem is therefore proved.

CHAPTER XII.
ON THE MOTION OF SYSTEMS AND ENSEMBLES OF SYSTEMS THROUGH LONG PERIODS OF TIME.

An important question which suggests itself in regard to any case of dynamical motion is whether the system considered will return in the course of time to its initial phase, or, if it will not return exactly to that phase, whether it will do so to any required degree of approximation in the course of a sufficiently long time. To be able to give even a partial answer to such questions, we must know something in regard to the dynamical nature of the system. In the following theorem, the only assumption in this respect is such as we have found necessary for the existence of the canonical distribution.

If we imagine an ensemble of identical systems to be distributed with a uniform density throughout any finite extension-in-phase, the number of the systems which leave the extension-in-phase and will not return to it in the course of time is less than any assignable fraction of the whole number; provided, that the total extension-in-phase for the systems considered between two limiting values of the energy is finite, these limiting values being less and greater respectively than any of the energies of the first-mentioned extension-in-phase.

To prove this, we observe that at the moment which we call initial the systems occupy the given extension-in-phase. It is evident that some systems must leave the extension immediately, unless all remain in it forever.
Those systems which leave the extension at the first instant, we shall call the front of the ensemble. It will be convenient to speak of this front as generating the extension-in-phase through which it passes in the course of time, as in geometry a surface is said to generate the volume through which it passes. In equal times the front generates equal extensions in phase. This is an immediate consequence of the principle of conservation of extension-in-phase, unless indeed we prefer to consider it as a slight variation in the expression of that principle. For in two equal short intervals of time let the extensions generated be A and B. (We make the intervals short simply to avoid the complications in the enunciation or interpretation of the principle which would arise when the same extension-in-phase is generated more than once in the interval considered.) Now if we imagine that at a given instant systems are distributed throughout the extension A, it is evident that the same systems will after a certain time occupy the extension B, which is therefore equal to A in virtue of the principle cited. The front of the ensemble, therefore, goes on generating equal extensions in equal times. But these extensions are included in a finite extension, viz., that bounded by certain limiting values of the energy. Sooner or later, therefore, the front must generate phases which it has before generated. Such second generation of the same phases must commence with the initial phases. Therefore a portion at least of the front must return to the original extension-in-phase. The same is of course true of the portion of the ensemble which follows that portion of the front through the same phases at a later time.

It remains to consider how large the portion of the ensemble is, which will return to the original extension-in-phase. There can be no portion of the given extension-in-phase, the systems of which leave the extension and do not return.
For we can prove for any portion of the extension as for the whole, that at least a portion of the systems leaving it will return.

We may divide the given extension-in-phase into parts as follows. There may be parts such that the systems within them will never pass out of them. These parts may indeed constitute the whole of the given extension. But if the given extension is very small, these parts will in general be nonexistent. There may be parts such that systems within them will all pass out of the given extension and all return within it. The whole of the given extension-in-phase is made up of parts of these two kinds. This does not exclude the possibility of phases on the boundaries of such parts, such that systems starting with those phases would leave the extension and never return. But in the supposed distribution of an ensemble of systems with a uniform density-in-phase, such systems would not constitute any assignable fraction of the whole number.

These distinctions may be illustrated by a very simple example. If we consider the motion of a rigid body of which one point is fixed, and which is subject to no forces, we find three cases. (1) The motion is periodic. (2) The system will never return to its original phase, but will return infinitely near to it. (3) The system will never return either exactly or approximately to its original phase. But if we consider any extension-in-phase, however small, a system leaving that extension will return to it except in the case called by Poinsot 'singular,' viz., when the motion is a rotation about an axis lying in one of two planes having a fixed position relative to the rigid body.
But all such phases do not constitute any true extension-in-phase in the sense in which we have defined and used the term.*

In the same way it may be proved that the systems in a canonical ensemble which at a given instant are contained within any finite extension-in-phase will in general return to that extension-in-phase, if they leave it, the exceptions, i.e., the number which pass out of the extension-in-phase and do not return to it, being less than any assignable fraction of the whole number. In other words, the probability that a system taken at random from the part of a canonical ensemble which is contained within any given extension-in-phase, will pass out of that extension and not return to it, is zero.

* An ensemble of systems distributed in phase is a less simple and elementary conception than a single system. But by the consideration of suitable ensembles instead of single systems, we may get rid of the inconvenience of having to consider exceptions formed by particular cases of the integral equations of motion, these cases simply disappearing when the ensemble is substituted for the single system as a subject of study. This is especially true when the ensemble is distributed, as in the case called canonical, throughout an extension-in-phase. In a less degree it is true of the microcanonical ensemble, which does not occupy any extension-in-phase, (in the sense in which we have used the term,) although it is convenient to regard it as a limiting case with respect to ensembles which do, as we thus gain for the subject some part of the analytical simplicity which belongs to the theory of ensembles which occupy true extensions-in-phase.

A similar theorem may be enunciated with respect to a microcanonical ensemble. Let us consider the fractional part of such an ensemble which lies within any given limits of phase. This fraction we shall denote by $F$.
It is evidently constant in time since the ensemble is in statistical equilibrium. The systems within the limits will not in general remain the same, but some will pass out in each unit of time while an equal number come in. Some may pass out never to return within the limits. But the number which in any time however long pass out of the limits never to return will not bear any finite ratio to the number within the limits at a given instant. For, if it were otherwise, let $f$ denote the fraction representing such ratio for the time $T$. Then, in the time $T$, the number which pass out never to return will bear the ratio $fF$ to the whole number in the ensemble, and in a time exceeding $T/(fF)$ the number which pass out of the limits never to return would exceed the total number of systems in the ensemble. The proposition is therefore proved.

This proof will apply to the cases before considered, and may be regarded as more simple than that which was given. It may also be applied to any true case of statistical equilibrium. By a true case of statistical equilibrium is meant such as may be described by giving the general value of the probability that an unspecified system of the ensemble is contained within any given limits of phase.*

* An ensemble in which the systems are material points constrained to move in vertical circles, with just enough energy to carry them to the highest points, cannot afford a true example of statistical equilibrium. For any other value of the energy than the critical value mentioned, we might in various ways describe an ensemble in statistical equilibrium, while the same language applied to the critical value of the energy would fail to do so. Thus, if we should say that the ensemble is so distributed that the probability that a system is in any given part of the circle is proportioned to the time which a single system spends in that part, motion in either direction being equally probable, we should perfectly define a distribution in statistical equilibrium for any value of the energy except the critical value mentioned above, but for this value of the energy all the probabilities in question would vanish unless the highest point is included in the part of the circle considered, in which case the probability is unity, or forms one of its limits, in which case the probability is indeterminate. Compare the foot-note on page 118.
A still more simple example is afforded by the uniform motion of a material point in a straight line. Here the impossibility of statistical equilibrium is not limited to any particular energy, and the canonical distribution as well as the microcanonical is impossible.
These examples are mentioned here in order to show the necessity of caution in the application of the above principle, with respect to the question whether we have to do with a true case of statistical equilibrium.
Another point in respect to which caution must be exercised is that the part of an ensemble of which the theorem of the return of systems is asserted should be entirely defined by limits within which it is contained, and not by any such condition as that a certain function of phase shall have a given value. This is necessary in order that the part of the ensemble which is considered should be any assignable fraction of the whole. Thus, if we have a canonical ensemble consisting of material points in vertical circles, the theorem of the return of systems may be applied to a part of the ensemble defined as contained in a given part of the circle. But it may not be applied in all cases to a part of the ensemble defined as contained in a given part of the circle and having a given energy. It would, in fact, express the exact opposite of the truth when the given energy is the critical value mentioned above.

Let us next consider whether an ensemble of isolated systems has any tendency in the course of time toward a state of statistical equilibrium.

There are certain functions of phase which are constant in time. The distribution of the ensemble with respect to the values of these functions is necessarily invariable, that is, the number of systems within any limits which can be specified in terms of these functions cannot vary in the course of time. The distribution in phase which without violating this condition gives the least value of the average index of probability of phase ($\overline{\eta}$) is unique, and is that in which the index of probability ($\eta$) is a function of the functions mentioned.* It is therefore a permanent distribution,† and the only permanent distribution consistent with the invariability of the distribution with respect to the functions of phase which are constant in time.

It would seem, therefore, that we might find a sort of measure of the deviation of an ensemble from statistical equilibrium in the excess of the average index above the minimum which is consistent with the condition of the invariability of the distribution with respect to the constant functions of phase. But we have seen that the index of probability is constant in time for each system of the ensemble. The average index is therefore constant, and we find by this method no approach toward statistical equilibrium in the course of time.

Yet we must here exercise great caution. One function may approach indefinitely near to another function, while some quantity determined by the first does not approach the corresponding quantity determined by the second.
A line joining two points may approach indefinitely near to the straight line joining them, while its length remains constant. We may find a closer analogy with the case under consideration in the effect of stirring an incompressible liquid.‡ In space of $2n$ dimensions the case might be made analytically identical with that of an ensemble of systems of $n$ degrees of freedom, but the analogy is perfect in ordinary space. Let us suppose the liquid to contain a certain amount of coloring matter which does not affect its hydrodynamic properties. Now the state in which the density of the coloring matter is uniform, i.e., the state of perfect mixture, which is a sort of state of equilibrium in this respect, that the distribution of the coloring matter in space is not affected by the internal motions of the liquid, is characterized by a minimum value of the average square of the density of the coloring matter. Let us suppose, however, that the coloring matter is distributed with a variable density. If we give the liquid any motion whatever, subject only to the hydrodynamic law of incompressibility (it may be a steady flux, or it may vary with the time), the density of the coloring matter at any same point of the liquid will be unchanged, and the average square of this density will therefore be unchanged. Yet no fact is more familiar to us than that stirring tends to bring a liquid to a state of uniform mixture, or uniform densities of its components, which is characterized by minimum values of the average squares of these densities. It is quite true that in the physical experiment the result is hastened by the process of diffusion, but the result is evidently not dependent on that process.

* See Chapter XI, Theorem IV.
† See Chapter IV, sub init.
‡ By liquid is here meant the continuous body of theoretical hydrodynamics, and not anything of the molecular structure and molecular motions of real liquids.
The contradiction is to be traced to the notion of the density of the coloring matter, and the process by which this quantity is evaluated. This quantity is the limiting ratio of the quantity of the coloring matter in an element of space to the volume of that element. Now if we should take for our elements of volume, after any amount of stirring, the spaces occupied by the same portions of the liquid which originally occupied any given system of elements of volume, the densities of the coloring matter, thus estimated, would be identical with the original densities as determined by the given system of elements of volume. Moreover, if at the end of any finite amount of stirring we should take our elements of volume in any ordinary form but sufficiently small, the average square of the density of the coloring matter, as determined by such element of volume, would approximate to any required degree to its value before the stirring. But if we take any element of space of fixed position and dimensions, we may continue the stirring so long that the densities of the colored liquid estimated for these fixed elements will approach a uniform limit, viz., that of perfect mixture. The case is evidently one of those in which the limit of a limit has different values, according to the order in which we apply the processes of taking a limit. If treating the elements of volume as constant, we continue the stirring indefinitely, we get a uniform density, a result not affected by making the elements as small as we choose; but if treating the amount of stirring as finite, we diminish indefinitely the elements of volume, we get exactly the same distribution in density as before the stirring, a result which is not affected by continuing the stirring as long as we choose. The question is largely one of language and definition.
One may perhaps be allowed to say that a finite amount of stirring will not affect the mean square of the density of the coloring matter, but an infinite amount of stirring may be regarded as producing a condition in which the mean square of the density has its minimum value, and the density is uniform. We may certainly say that a sensibly uniform density of the colored component may be produced by stirring. Whether the time required for this result would be long or short depends upon the nature of the motion given to the liquid, and the fineness of our method of evaluating the density. All this may appear more distinctly if we consider a special case of liquid motion. Let us imagine a cylindrical mass of liquid of which one sector of 90° is black and the rest white. Let it have a motion of rotation about the axis of the cylinder in which the angular velocity is a function of the distance from the axis. In the course of time the black and the white parts would become drawn out into thin ribbons, which would be wound spirally about the axis. The thickness of these ribbons would diminish without limit, and the liquid would therefore tend toward a state of perfect mixture of the black and white portions. That is, in any given element of space, the proportion of the black and white would approach 1 : 3 as a limit. Yet after any finite time, the total volume would be divided into two parts, one of which would consist of the white liquid exclusively, and the other of the black exclusively. If the coloring matter, instead of being distributed initially with a uniform density throughout a section of the cylinder, were distributed with a density represented by any arbitrary function of the cylindrical coordinates $r$, $\theta$, and $z$, the effect of the same motion continued indefinitely would be an approach to a condition in which the density is a function of $r$ and $z$ alone.
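The special case of the black sector can be illustrated numerically. The following is only a sketch under stated assumptions, not part of the original argument: the angular velocity law $\omega(r) = 1 + r$, the grid of fixed cells, the sample counts, and the function name `mean_square_densities` are all hypothetical choices made for illustration. The density is evaluated in two ways: point by point (where it is always 0 or 1) and as an average over fixed cells of equal area.

```python
import numpy as np

def mean_square_densities(t, n_r=8, n_theta=16, samples=400, seed=0):
    """Return (fine, coarse) mean-square density of the black liquid at time t.

    The unit disc is divided into n_r * n_theta fixed cells of equal area
    (uniform in s = r**2 and in theta). Each cell is probed by random points.
    """
    rng = np.random.default_rng(seed)
    shape = (n_r, n_theta, samples)
    # Lower edges of the fixed cells in s = r**2 and in theta.
    s_lo = np.arange(n_r)[:, None, None] / n_r
    th_lo = 2.0 * np.pi * np.arange(n_theta)[None, :, None] / n_theta
    s = s_lo + rng.random(shape) / n_r
    th = th_lo + rng.random(shape) * (2.0 * np.pi / n_theta)
    r = np.sqrt(s)
    omega = 1.0 + r  # ASSUMED angular velocity as a function of r
    # Trace each probe point back to where its liquid was at time 0.
    th_initial = (th - omega * t) % (2.0 * np.pi)
    # The liquid is black if it started in the 90-degree sector.
    black = (th_initial < np.pi / 2.0).astype(float)
    fine = float(np.mean(black ** 2))                 # density taken point by point
    coarse = float(np.mean(black.mean(axis=2) ** 2))  # density averaged per fixed cell
    return fine, coarse

fine0, coarse0 = mean_square_densities(0.0)
fine1, coarse1 = mean_square_densities(2000.0)
print("t = 0:    fine", fine0, " coarse", coarse0)
print("t = 2000: fine", fine1, " coarse", coarse1)
```

With these assumed settings, the point-by-point mean square stays near 1/4 at both times, while the mean square estimated from the fixed cells falls from 1/4 toward 1/16, the value for perfect mixture in the proportion 1 : 3.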
In this limiting condition, the average square of the density would be less than in the original condition, when the density was supposed to vary with $\theta$, although after any finite time the average square of the density would be the same as at first. If we limit our attention to the motion in a single plane perpendicular to the axis of the cylinder, we have something which is almost identical with a diagrammatic representation of the changes in distribution in phase of an ensemble of systems of one degree of freedom, in which the motion is periodic, the period varying with the energy, as in the case of a pendulum swinging in a circular arc. If the coordinates and momenta of the systems are represented by rectangular coordinates in the diagram, the points in the diagram representing the changing phases of moving systems will move about the origin in closed curves of constant energy. The motion will be such that areas bounded by points representing moving systems will be preserved. The only difference between the motion of the liquid and the motion in the diagram is that in one case the paths are circular, and in the other they differ more or less from that form. When the energy is proportional to $p^2 +$ [...] its possible value.

Now if the case were one of statistical equilibrium, the value of $\eta$ would be constant in any path, and if all the paths which pass through $DV''$ also pass through or near $DV'$, the value of $\eta$ throughout $DV''$ will vary little from $\eta'$. But when the case is not one of statistical equilibrium, we cannot draw any such conclusion. The only conclusion which we can draw with respect to the phase at $t'$ of the systems which at $t''$ are in $DV''$ is that they are nearly on the same path.
Now if we should make a new estimate of indices of probability of phase at the time $t''$, using for this purpose the elements $DV''$, that is, if we should divide the number of systems in $DV''$, for example, by the total number of systems, and also by the extension-in-phase of the element, and take the logarithm of the quotient, we would get a number which would be less than the average value of $\eta$ for the systems within $DV''$ based on the distribution in phase at the time $t'$.* Hence the average value of $\eta$ for the whole ensemble of systems based on the distribution at $t''$ will be less than the average value based on the distribution at $t'$. We must not forget that there are exceptions to this general rule. These exceptions are in cases in which the laws of motion are such that systems having small differences of phase will continue always to have small differences of phase. It is to be observed that if the average index of probability in an ensemble may be said in some sense to have a less value at one time than at another, it is not necessarily priority in time which determines the greater average index. If a distribution, which is not one of statistical equilibrium, should be given for a time $t'$, and the distribution at an earlier time $t''$ should be defined as that given by the corresponding phases, if we increase the interval leaving $t'$ fixed and taking $t''$ at an earlier and earlier date, the distribution at $t''$ will in general approach a limiting distribution which is in statistical equilibrium. The determining difference in such cases is that between a definite distribution at a definite time and the limit of a varying distribution when the moment considered is carried either forward or backward indefinitely.† But while the distinction of prior and subsequent events may be immaterial with respect to mathematical fictions, it is quite otherwise with respect to the events of the real world.
It should not be forgotten, when our ensembles are chosen to illustrate the probabilities of events in the real world, that while the probabilities of subsequent events may often be determined from the probabilities of prior events, it is rarely the case that probabilities of prior events can be determined from those of subsequent events, for we are rarely justified in excluding the consideration of the antecedent probability of the prior events. It is worthy of notice that to take a system at random from an ensemble at a date chosen at random from several given dates, $t'$, $t''$, etc., is practically the same thing as to take a system at random from the ensemble composed of all the systems of the given ensemble in their phases at the time $t'$, together with the same systems in their phases at the time $t''$, etc. By Theorem VIII of Chapter XI this will give an ensemble in which the average index of probability will be less than in the given ensemble, except in the case when the distribution in the given ensemble is the same at the times $t'$, $t''$, etc. Consequently, any indefiniteness in the time in which we take a system at random from an ensemble has the practical effect of diminishing the average index of the ensemble from which the system may be supposed to be drawn, except when the given ensemble is in statistical equilibrium.

* See Chapter XI, Theorem IX.
† One may compare the kinematical truism that when two points are moving with uniform velocities (with the single exception of the case where the relative motion is zero), their mutual distance at any definite time is less than for $t = \infty$, or $t = -\infty$.

CHAPTER XIII.

EFFECT OF VARIOUS PROCESSES ON AN ENSEMBLE OF SYSTEMS.

In the last chapter and in Chapter I we have considered the changes which take place in the course of time in an ensemble of isolated systems. Let us now proceed to consider the changes which will take place in an ensemble of systems under external influences.
These external influences will be of two kinds, the variation of the coordinates which we have called external, and the action of other ensembles of systems. The essential difference of the two kinds of influence consists in this, that the bodies to which the external coordinates relate are not distributed in phase, while in the case of interaction of the systems of two ensembles, we have to regard the fact that both are distributed in phase. To find the effect produced on the ensemble with which we are principally concerned, we have therefore to consider single values of what we have called external coordinates, but an infinity of values of the internal coordinates of any other ensemble with which there is interaction. Or, to regard the subject from another point of view, the action between an unspecified system of an ensemble and the bodies represented by the external coordinates is the action between a system imperfectly determined with respect to phase and one which is perfectly determined; while the interaction between two unspecified systems belonging to different ensembles is the action between two systems both of which are imperfectly determined with respect to phase.*

* In the development of the subject, we shall find that this distinction corresponds to the distinction in thermodynamics between mechanical and thermal action.

We shall suppose the ensembles which we consider to be distributed in phase in the manner described in Chapter I, and represented by the notations of that chapter, especially by the index of probability of phase ($\eta$). There are therefore $2n$ independent variations in the phases which constitute the ensembles considered. This excludes ensembles like the microcanonical, in which, as energy is constant, there are only $2n - 1$ independent variations of phase. This seems necessary for the purposes of a general discussion.
For although we may imagine a microcanonical ensemble to have a permanent existence when isolated from external influences, the effect of such influences would generally be to destroy the uniformity of energy in the ensemble. Moreover, since the microcanonical ensemble may be regarded as a limiting case of such ensembles as are described in Chapter I, (and that in more than one way, as shown in Chapter X,) the exclusion is rather formal than real, since any properties which belong to the microcanonical ensemble could easily be derived from those of the ensembles of Chapter I, which in a certain sense may be regarded as representing the general case. Let us first consider the effect of variation of the external coordinates. We have already had occasion to regard these quantities as variable in the differentiation of certain equations relating to ensembles distributed according to certain laws called canonical or microcanonical. That variation of the external coordinates was, however, only carrying the attention of the mind from an ensemble with certain values of the external coordinates, and distributed in phase according to some general law depending upon those values, to another ensemble with different values of the external coordinates, and with the distribution changed to conform to these new values. What we have now to consider is the effect which would actually result in the course of time in an ensemble of systems in which the external coordinates should be varied in any arbitrary manner. Let us suppose, in the first place, that these coordinates are varied abruptly at a given instant, being constant both before and after that instant. By the definition of the external coordinates it appears that this variation does not affect the phase of any system of the ensemble at the time when it takes place.
Therefore it does not affect the index of probability of phase ($\eta$) of any system, or the average value of the index ($\bar\eta$) at that time. And if these quantities are constant in time before the variation of the external coordinates, and after that variation, their constancy in time is not interrupted by that variation. In fact, in the demonstration of the conservation of probability of phase in Chapter I, the variation of the external coordinates was not excluded. But a variation of the external coordinates will in general disturb a previously existing state of statistical equilibrium. For, although it does not affect (at the first instant) the distribution-in-phase, it does affect the condition necessary for equilibrium. This condition, as we have seen in Chapter IV, is that the index of probability of phase shall be a function of phase which is constant in time for moving systems. Now a change in the external coordinates, by changing the forces which act on the systems, will change the nature of the functions of phase which are constant in time. Therefore, the distribution in phase which was one of statistical equilibrium for the old values of the external coordinates, will not be such for the new values. Now we have seen, in the last chapter, that when the distribution-in-phase is not one of statistical equilibrium, an ensemble of systems may, and in general will, after a longer or shorter time, come to a state which may be regarded, if very small differences of phase are neglected, as one of statistical equilibrium, and in which consequently the average value of the index ($\bar\eta$) is less than at first. It is evident, therefore, that a variation of the external coordinates, by disturbing a state of statistical equilibrium, may indirectly cause a diminution (in a certain sense at least) of the value of $\bar\eta$. But if the change in the external coordinates is very small, the change in the distribution necessary for equilibrium will in general be correspondingly small.
Hence, the original distribution in phase, since it differs little from one which would be in statistical equilibrium with the new values of the external coordinates, may be supposed to have a value of $\bar\eta$ which differs by a small quantity of the second order from the minimum value which characterizes the state of statistical equilibrium. And the diminution in the average index resulting in the course of time from the very small change in the external coordinates, cannot exceed this small quantity of the second order. Hence also, if the change in the external coordinates of an ensemble initially in statistical equilibrium consists in successive very small changes separated by very long intervals of time in which the disturbance of statistical equilibrium becomes sensibly effaced, the final diminution in the average index of probability will in general be negligible, although the total change in the external coordinates is large. The result will be the same if the change in the external coordinates takes place continuously but sufficiently slowly. Even in cases in which there is no tendency toward the restoration of statistical equilibrium in the lapse of time, a variation of external coordinates which would cause, if it took place in a short time, a great disturbance of a previous state of equilibrium, may, if sufficiently distributed in time, produce no sensible disturbance of the statistical equilibrium. Thus, in the case of three degrees of freedom, let the systems be heavy points suspended by elastic massless cords, and let the ensemble be distributed in phase with a density proportioned to some function of the energy, and therefore in statistical equilibrium. For a change in the external coordinates, we may take a horizontal motion of the point of suspension.
If this is moved a given distance, the resulting disturbance of the statistical equilibrium may evidently be diminished indefinitely by diminishing the velocity of the point of suspension. This will be true if the law of elasticity of the string is such that the period of vibration is independent of the energy, in which case there is no tendency in the course of time toward a state of statistical equilibrium, as well as in the more general case, in which there is a tendency toward statistical equilibrium. That something of this kind will be true in general, the following considerations will tend to show.

We define a path as the series of phases through which a system passes in the course of time when the external coordinates have fixed values. When the external coordinates are varied, paths are changed. The path of a phase is the path to which that phase belongs. With reference to any ensemble of systems we shall denote by $\bar{D}|_p$ the average value of the density-in-phase in a path. This implies that we have a measure for comparing different portions of the path. We shall suppose the time required to traverse any portion of a path to be its measure for the purpose of determining this average. With this understanding, let us suppose that a certain ensemble is in statistical equilibrium. In every element of extension-in-phase, therefore, the density-in-phase $D$ is equal to its path-average $\bar{D}|_p$. Let a sudden small change be made in the external coordinates. The statistical equilibrium will be disturbed and we shall no longer have $D = \bar{D}|_p$ everywhere. This is not because $D$ is changed, but because $\bar{D}|_p$ is changed, the paths being changed. It is evident that if $D > \bar{D}|_p$ in a part of a path, we shall have $D < \bar{D}|_p$ in other parts of the same path. Now, if we should imagine a further change in the external coordinates of the same kind, we should expect it to produce an effect of the same kind.
But the manner in which the second effect will be superposed on the first will be different, according as it occurs immediately after the first change or after an interval of time. If it occurs immediately after the first change, then in any element of phase in which the first change produced a positive value of $D - \bar{D}|_p$ the second change will add a positive value to the first positive value, and where $D - \bar{D}|_p$ was negative, the second change will add a negative value to the first negative value. But if we wait a sufficient time before making the second change in the external coordinates, so that systems have passed from elements of phase in which $D - \bar{D}|_p$ was originally positive to elements in which it was originally negative, and vice versa, (the systems carrying with them the values of $D - \bar{D}|_p$,) the positive values of $D - \bar{D}|_p$ caused by the second change will be in part superposed on negative values due to the first change, and vice versa. The disturbance of statistical equilibrium, therefore, produced by a given change in the values of the external coordinates may be very much diminished by dividing the change into two parts separated by a sufficient interval of time, and a sufficient interval of time for this purpose is one in which the phases of the individual systems are entirely unlike the first, so that any individual system is differently affected by the change, although the whole ensemble is affected in nearly the same way. Since there is no limit to the diminution of the disturbance of equilibrium by division of the change in the external coordinates, we may suppose as a general rule that by diminishing the velocity of the changes in the external coordinates, a given change may be made to produce a very small disturbance of statistical equilibrium.
If we write $\bar\eta'$ for the value of the average index of probability before the variation of the external coordinates, and $\bar\eta''$ for the value after this variation, we shall have in any case

$$\bar\eta'' \le \bar\eta'$$

as the simple result of the variation of the external coordinates. This may be compared with the thermodynamic theorem that the entropy of a body cannot be diminished by mechanical (as distinguished from thermal) action.* If we have (approximate) statistical equilibrium between the times $t'$ and $t''$ (corresponding to $\bar\eta'$ and $\bar\eta''$), we shall have approximately

$$\bar\eta' = \bar\eta'',$$

which may be compared with the thermodynamic theorem that the entropy of a body is not (sensibly) affected by mechanical action, during which the body is at each instant (sensibly) in a state of thermodynamic equilibrium.

* The correspondences to which the reader's attention is called are between $-\bar\eta$ and entropy, and between $\Theta$ and temperature.

Approximate statistical equilibrium may usually be attained by a sufficiently slow variation of the external coordinates, just as approximate thermodynamic equilibrium may usually be attained by sufficient slowness in the mechanical operations to which the body is subject. We now pass to the consideration of the effect on an ensemble of systems which is produced by the action of other ensembles with which it is brought into dynamical connection. In a previous chapter * we have imagined a dynamical connection arbitrarily created between the systems of two ensembles. We shall now regard the action between the systems of the two ensembles as a result of the variation of the external coordinates, which causes such variations of the internal coordinates as to bring the systems of the two ensembles within the range of each other's action.
* See Chapter IV, page 37.

Initially, we suppose that we have two separate ensembles of systems, $E_1$ and $E_2$. The numbers of degrees of freedom of the systems in the two ensembles will be denoted by $n_1$ and $n_2$ respectively, and the probability-coefficients by $e^{\eta_1}$ and $e^{\eta_2}$. Now we may regard any system of the first ensemble combined with any system of the second as forming a single system of $n_1 + n_2$ degrees of freedom. Let us consider the ensemble ($E_{12}$) obtained by thus combining each system of the first ensemble with each of the second. At the initial moment, which may be specified by a single accent, the probability-coefficient of any phase of the combined systems is evidently the product of the probability-coefficients of the phases of which it is made up. This may be expressed by the equation,

$$e^{\eta_{12}'} = e^{\eta_1'}\,e^{\eta_2'}, \tag{455}$$

or

$$\eta_{12}' = \eta_1' + \eta_2', \tag{456}$$

which gives

$$\bar\eta_{12}' = \bar\eta_1' + \bar\eta_2'. \tag{457}$$

The forces tending to vary the internal coordinates of the combined systems, together with those exerted by either system upon the bodies represented by the coordinates called external, may be derived from a single force-function, which, taken negatively, we shall call the potential energy of the combined systems and denote by $\epsilon_{12}$. But we suppose that initially none of the systems of the two ensembles $E_1$ and $E_2$ come within range of each other's action, so that the potential energy of the combined system falls into two parts relating separately to the systems which are combined. The same is obviously true of the kinetic energy of the combined compound system, and therefore of its total energy. This may be expressed by the equation

$$\epsilon_{12}' = \epsilon_1' + \epsilon_2', \tag{458}$$

which gives

$$\bar\epsilon_{12}' = \bar\epsilon_1' + \bar\epsilon_2'. \tag{459}$$

Let us now suppose that in the course of time, owing to the motion of the bodies represented by the coordinates called external, the forces acting on the systems and consequently their positions are so altered, that the systems of the ensembles $E_1$ and $E_2$ are brought within range of each other's action, and after such mutual influence has lasted for a time, by a further change in the external coordinates, perhaps a return to their original values, the systems of the two original ensembles are brought again out of range of each other's action. Finally, then, at a time specified by double accents, we shall have as at first

$$\bar\epsilon_{12}'' = \bar\epsilon_1'' + \bar\epsilon_2''. \tag{460}$$

But for the indices of probability we must write *

$$\bar\eta_1'' + \bar\eta_2'' \le \bar\eta_{12}''. \tag{461}$$

* See Chapter XI, Theorem VII.

The considerations adduced in the last chapter show that it is safe to write

$$\bar\eta_{12}'' \le \bar\eta_{12}'. \tag{462}$$

We have therefore

$$\bar\eta_1'' + \bar\eta_2'' \le \bar\eta_1' + \bar\eta_2', \tag{463}$$

which may be compared with the thermodynamic theorem that the thermal contact of two bodies may increase but cannot diminish the sum of their entropies. Let us especially consider the case in which the two original ensembles were both canonically distributed in phase with the respective moduli $\Theta_1$ and $\Theta_2$. We have then, by Theorem III of Chapter XI,

$$\bar\eta_1' + \frac{\bar\epsilon_1'}{\Theta_1} \le \bar\eta_1'' + \frac{\bar\epsilon_1''}{\Theta_1}, \tag{464}$$

$$\bar\eta_2' + \frac{\bar\epsilon_2'}{\Theta_2} \le \bar\eta_2'' + \frac{\bar\epsilon_2''}{\Theta_2}. \tag{465}$$

Whence with (463) we have

$$\frac{\bar\epsilon_1'}{\Theta_1} + \frac{\bar\epsilon_2'}{\Theta_2} \le \frac{\bar\epsilon_1''}{\Theta_1} + \frac{\bar\epsilon_2''}{\Theta_2}, \tag{466}$$

or

$$\frac{\bar\epsilon_1'' - \bar\epsilon_1'}{\Theta_1} + \frac{\bar\epsilon_2'' - \bar\epsilon_2'}{\Theta_2} \ge 0. \tag{467}$$

If we write $W$ for the average work done by the combined systems on the external bodies, we have by the principle of the conservation of energy

$$W = \bar\epsilon_{12}' - \bar\epsilon_{12}'' = \bar\epsilon_1' - \bar\epsilon_1'' + \bar\epsilon_2' - \bar\epsilon_2''. \tag{468}$$

Now if $W$ is negligible, we have

$$\bar\epsilon_1'' - \bar\epsilon_1' = -(\bar\epsilon_2'' - \bar\epsilon_2'), \tag{469}$$

and (467) shows that the ensemble which has the greater modulus must lose energy. This result may be compared to the thermodynamic principle, that when two bodies of different temperatures are brought together, that which has the higher temperature will lose energy.
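The step from (467) and (469) to the conclusion about the greater modulus may be written out as follows. This is a worked restatement offered for illustration, using only the symbols of the text and assuming, as the text does implicitly, that both moduli are positive.

```latex
\begin{aligned}
&\bar\epsilon_2'' - \bar\epsilon_2' = -(\bar\epsilon_1'' - \bar\epsilon_1')
  && \text{by (469)} \\
&\frac{\bar\epsilon_1'' - \bar\epsilon_1'}{\Theta_1}
  - \frac{\bar\epsilon_1'' - \bar\epsilon_1'}{\Theta_2} \ge 0
  && \text{substituting in (467)} \\
&(\bar\epsilon_1'' - \bar\epsilon_1')\,
  \frac{\Theta_2 - \Theta_1}{\Theta_1 \Theta_2} \ge 0
  && \text{collecting terms}
\end{aligned}
```

If $\Theta_1 > \Theta_2 > 0$, the fraction is negative, so that $\bar\epsilon_1'' - \bar\epsilon_1' \le 0$: the first ensemble cannot gain energy, and by (469) the second cannot lose it. The case $\Theta_2 > \Theta_1$ follows by exchanging the suffixes.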
Let us next suppose that the ensemble $E_2$ is originally canonically distributed with the modulus $\Theta_2$, but leave the distribution of the other arbitrary. We have, to determine the result of a similar process,

$$\bar\eta_1'' + \bar\eta_2'' \le \bar\eta_1' + \bar\eta_2',$$

$$\bar\eta_2' + \frac{\bar\epsilon_2'}{\Theta_2} \le \bar\eta_2'' + \frac{\bar\epsilon_2''}{\Theta_2}.$$

Hence

$$\bar\eta_1'' + \frac{\bar\epsilon_2'}{\Theta_2} \le \bar\eta_1' + \frac{\bar\epsilon_2''}{\Theta_2}, \tag{470}$$

which may be written

$$\bar\eta_1' - \bar\eta_1'' \ge \frac{\bar\epsilon_2' - \bar\epsilon_2''}{\Theta_2}. \tag{471}$$

This may be compared with the thermodynamic principle that when a body (which need not be in thermal equilibrium) is brought into thermal contact with another of a given temperature, the increase of entropy of the first cannot be less (algebraically) than the loss of heat by the second divided by its temperature. Where $W$ is negligible, we may write

$$\bar\eta_1'' + \frac{\bar\epsilon_1''}{\Theta_2} \le \bar\eta_1' + \frac{\bar\epsilon_1'}{\Theta_2}. \tag{472}$$

Now, by Theorem III of Chapter XI, the quantity

$$\bar\eta_1 + \frac{\bar\epsilon_1}{\Theta_2} \tag{473}$$

has a minimum value when the ensemble to which $\bar\eta_1$ and $\bar\epsilon_1$ relate is distributed canonically with the modulus $\Theta_2$. If the ensemble had originally this distribution, the sign $<$ in (472) would be impossible. In fact, in this case, it would be easy to show that the preceding formulae on which (472) is founded would all have the sign $=$. But when the two ensembles are not both originally distributed canonically with the same modulus, the formulae indicate that the quantity (473) may be diminished by bringing the ensemble to which $\bar\epsilon_1$ and $\bar\eta_1$ relate into connection with another which is canonically distributed with modulus $\Theta_2$, and therefore, that by repeated operations of this kind the ensemble of which the original distribution was entirely arbitrary might be brought approximately into a state of canonical distribution with the modulus $\Theta_2$.
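The minimum property of the quantity (473) can be checked numerically in a discrete analogue. The sketch below is an illustration only, under assumptions that are not in the text: phase space is replaced by a finite set of cells of equal extension, so that the average index becomes $\sum p \ln p$; the cell energies, the modulus value, and the names `gibbs_functional` and `canonical` are hypothetical.

```python
import numpy as np

def gibbs_functional(p, energy, theta):
    """Discrete analogue of (473): sum(p ln p) + sum(p * energy) / theta."""
    p = np.asarray(p, dtype=float)
    mask = p > 0  # cells with p = 0 contribute nothing to sum(p ln p)
    return float(np.sum(p[mask] * np.log(p[mask])) + np.dot(p, energy) / theta)

def canonical(energy, theta):
    """Canonical probabilities, proportional to exp(-energy / theta)."""
    w = np.exp(-(energy - energy.min()) / theta)
    return w / w.sum()

# ASSUMED illustrative data: 50 cells of equal extension-in-phase.
energy = np.linspace(0.0, 5.0, 50)
theta = 1.3

p_can = canonical(energy, theta)
f_can = gibbs_functional(p_can, energy, theta)

# Compare with 200 arbitrary (random) distributions over the same cells.
rng = np.random.default_rng(1)
f_trials = [gibbs_functional(rng.dirichlet(np.ones(energy.size)), energy, theta)
            for _ in range(200)]
f_min_trial = min(f_trials)

# For the canonical distribution the functional equals -ln(sum exp(-energy/theta)).
minus_log_z = -float(np.log(np.sum(np.exp(-energy / theta))))
print(f_can, minus_log_z, f_min_trial)
```

In this discrete setting the excess of the functional over its canonical value is a relative entropy and is therefore never negative, which is why none of the trial distributions falls below the canonical one.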
We may compare this with the thermodynamic principle that a body of which the original thermal state may be entirely arbitrary, may be brought approximately into a state of thermal equilibrium with any given temperature by repeated connections with other bodies of that temperature.

Let us now suppose that we have a certain number of ensembles, $E_0$, $E_1$, $E_2$, etc., distributed canonically with the respective moduli $\Theta_0$, $\Theta_1$, $\Theta_2$, etc. By variation of the external coordinates of the ensemble $E_0$, let it be brought into connection with $E_1$, and then let the connection be broken. Let it then be brought into connection with $E_2$, and then let that connection be broken. Let this process be continued with respect to the remaining ensembles. We do not make the assumption, as in some cases before, that the work connected with the variation of the external coordinates is a negligible quantity. On the contrary, we wish especially to consider the case in which it is large. In the final state of the ensemble $E_0$, let us suppose that the external coordinates have been brought back to their original values, and that the average energy ($\bar\epsilon_0$) is the same as at first. In our usual notations, using one and two accents to distinguish original and final values, we get by repeated applications of the principle expressed in (463)

$$\bar\eta_0' + \bar\eta_1' + \bar\eta_2' + \text{etc.} \ge \bar\eta_0'' + \bar\eta_1'' + \bar\eta_2'' + \text{etc.} \tag{474}$$

But by Theorem III of Chapter XI,

$$\bar\eta_0'' + \frac{\bar\epsilon_0''}{\Theta_0} \ge \bar\eta_0' + \frac{\bar\epsilon_0'}{\Theta_0}, \tag{475}$$

$$\bar\eta_1'' + \frac{\bar\epsilon_1''}{\Theta_1} \ge \bar\eta_1' + \frac{\bar\epsilon_1'}{\Theta_1}, \tag{476}$$

$$\bar\eta_2'' + \frac{\bar\epsilon_2''}{\Theta_2} \ge \bar\eta_2' + \frac{\bar\epsilon_2'}{\Theta_2}, \tag{477}$$

etc. Hence

$$\frac{\bar\epsilon_0''}{\Theta_0} + \frac{\bar\epsilon_1''}{\Theta_1} + \frac{\bar\epsilon_2''}{\Theta_2} + \text{etc.} \ge \frac{\bar\epsilon_0'}{\Theta_0} + \frac{\bar\epsilon_1'}{\Theta_1} + \frac{\bar\epsilon_2'}{\Theta_2} + \text{etc.}, \tag{478}$$

or, since $\bar\epsilon_0' = \bar\epsilon_0''$,

$$0 \ge \frac{\bar\epsilon_1' - \bar\epsilon_1''}{\Theta_1} + \frac{\bar\epsilon_2' - \bar\epsilon_2''}{\Theta_2} + \text{etc.} \tag{479}$$

If we write $W$ for the average work done on the bodies represented by the external coordinates, we have

$$\bar\epsilon_1' - \bar\epsilon_1'' + \bar\epsilon_2' - \bar\epsilon_2'' + \text{etc.} = W. \tag{480}$$

If $E_0$, $E_1$, and $E_2$ are the only ensembles, we have

$$W \le \frac{\Theta_1 - \Theta_2}{\Theta_1}\,(\bar\epsilon_1' - \bar\epsilon_1''). \tag{481}$$

It will be observed that the relations expressed in the last three formulae between $W$, $\bar\epsilon_1' - \bar\epsilon_1''$, $\bar\epsilon_2' - \bar\epsilon_2''$, etc., and $\Theta_1$, $\Theta_2$, etc.
are precisely those which hold in a Carnot's cycle for the work obtained, the energy lost by the several bodies which serve as heaters or coolers, and their initial temperatures. It will not escape the reader's notice, that while from one point of view the operations which are here described are quite beyond our powers of actual performance, on account of the impossibility of handling the immense number of systems which are involved, yet from another point of view the operations described are the most simple and accurate means of representing what actually takes place in our simplest experiments in thermodynamics. The states of the bodies which we handle are certainly not known to us exactly. What we know about a body can generally be described most accurately and most simply by saying that it is one taken at random from a great number (ensemble) of bodies which are completely described. If we bring it into connection with another body concerning which we have a similar limited knowledge, the state of the two bodies is properly described as that of a pair of bodies taken from a great number (ensemble) of pairs which are formed by combining each body of the first ensemble with each of the second. Again, when we bring one body into thermal contact with another, for example, in a Carnot's cycle, when we bring a mass of fluid into thermal contact with some other body from which we wish it to receive heat, we may do it by moving the vessel containing the fluid. This motion is mathematically expressed by the variation of the coordinates which determine the position of the vessel. We allow ourselves for the purposes of a theoretical discussion to suppose that the walls of this vessel are incapable of absorbing heat from the fluid.
Yet while we exclude the kind of action which we call thermal between the fluid and the containing vessel, we allow the kind which we call work in the narrower sense, which takes place when the volume of the fluid is changed by the motion of a piston. This agrees with what we have supposed in regard to the external coordinates, which we may vary in any arbitrary manner, and are in this entirely unlike the coordinates of the second ensemble with which we bring the first into connection.

When heat passes in any thermodynamic experiment between the fluid principally considered and some other body, it is actually absorbed and given out by the walls of the vessel, which will retain a varying quantity. This is, however, a disturbing circumstance, which we suppose in some way made negligible, and actually neglect in a theoretical discussion. In our case, we suppose the walls incapable of absorbing energy, except through the motion of the external coordinates, but that they allow the systems which they contain to act directly on one another. Properties of this kind are mathematically expressed by supposing that in the vicinity of a certain surface, the position of which is determined by certain (external) coordinates, particles belonging to the system in question experience a repulsion from the surface increasing so rapidly with nearness to the surface that an infinite expenditure of energy would be required to carry them through it. It is evident that two systems might be separated by a surface or surfaces exerting the proper forces, and yet approach each other closely enough to exert mechanical action on each other.

CHAPTER XIV.

DISCUSSION OF THERMODYNAMIC ANALOGIES.

If we wish to find in rational mechanics an a priori foundation for the principles of thermodynamics, we must seek mechanical definitions of temperature and entropy.
The quantities thus defined must satisfy (under conditions and with limitations which again must be specified in the language of mechanics) the differential equation
$$d\epsilon = T\,d\eta - A_1\,da_1 - A_2\,da_2 - \text{etc.}, \tag{482}$$
where $\epsilon$, $T$, and $\eta$ denote the energy, temperature, and entropy of the system considered, and $A_1\,da_1$, etc., the mechanical work (in the narrower sense in which the term is used in thermodynamics, i. e., with exclusion of thermal action) done upon external bodies.

This implies that we are able to distinguish in mechanical terms the thermal action of one system on another from that which we call mechanical in the narrower sense, if not indeed in every case in which the two may be combined, at least so as to specify cases of thermal action and cases of mechanical action.

Such a differential equation moreover implies a finite equation between $\epsilon$, $\eta$, and $a_1$, [...] $V$ for any part is equal to its average value for any other part, and to the uniform value of the same expression for the whole ensemble. This corresponds to the theorem in the theory of heat that in case of thermal equilibrium the temperatures of the parts of a body are equal to one another and to that of the whole body.

Since the energies of the parts of a body cannot be supposed to remain absolutely constant, even where this is the case with respect to the whole body, it is evident that if we regard the temperature as a function of the energy, the taking of average or of probable values, or some other statistical process, must be used with reference to the parts, in order to get a perfectly definite value corresponding to the notion of temperature.
It is worthy of notice in this connection that the average value of the kinetic energy, either in a microcanonical ensemble, or in a canonical, divided by one half the number of degrees of freedom, is equal to $e^{-\phi}V$, or to its average value, and that this is true not only of the whole system which is distributed either microcanonically or canonically, but also of any part, although the corresponding theorem relating to temperature hardly belongs to empirical thermodynamics, since neither the (inner) kinetic energy of a body, nor its number of degrees of freedom is immediately cognizable to our faculties, and we meet the gravest difficulties when we endeavor to apply the theorem to the theory of gases, except in the simplest case, that of the gases known as monatomic.

But the correspondence between $e^{-\phi}V$ or $d\epsilon/d\log V$ and temperature is imperfect. If two isolated systems have such energies that
$$\frac{d\epsilon_1}{d\log V_1} = \frac{d\epsilon_2}{d\log V_2},$$
and the two systems are regarded as combined to form a third system with energy
$$\epsilon_{12} = \epsilon_1 + \epsilon_2,$$
we shall not have in general
$$\frac{d\epsilon_{12}}{d\log V_{12}} = \frac{d\epsilon_1}{d\log V_1} = \frac{d\epsilon_2}{d\log V_2},$$
as analogy with temperature would require. In fact, we have seen that
$$\frac{d\epsilon_{12}}{d\log V_{12}} = \overline{\frac{d\epsilon_1}{d\log V_1}}\Big|_{\epsilon_{12}} = \overline{\frac{d\epsilon_2}{d\log V_2}}\Big|_{\epsilon_{12}},$$
where the second and third members of the equation denote average values in an ensemble in which the compound system is microcanonically distributed in phase. Let us suppose the two original systems to be identical in nature. Then
$$\epsilon_1 = \epsilon_2 = \overline{\epsilon_1}\big|_{\epsilon_{12}} = \overline{\epsilon_2}\big|_{\epsilon_{12}}.$$
The equation in question would require that
$$\frac{d\epsilon_1}{d\log V_1} = \overline{\frac{d\epsilon_1}{d\log V_1}}\Big|_{\epsilon_{12}},$$
i. e., that we get the same result, whether we take the value of $d\epsilon_1/d\log V_1$ determined for the average value of $\epsilon_1$ in the ensemble, or take the average value of $d\epsilon_1/d\log V_1$. This will be the case where $d\epsilon_1/d\log V_1$ is a linear function of $\epsilon_1$. Evidently this does not constitute the most general case. Therefore the equation in question cannot be true in general.
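A small numerical check may make the preceding argument concrete. The sketch below is an illustration only, not part of the original text. It assumes that $dV/d\epsilon$ for each part is a polynomial in $\epsilon$ with $V = 0$ at $\epsilon = 0$, and that $dV/d\epsilon$ for the compound system is the convolution of those of the parts. The function names (`convolve`, `tau`, and so on) are hypothetical. Exact rational arithmetic is used so that equality can be tested without rounding.

```python
from fractions import Fraction
from math import factorial


def convolve(om1, om2):
    """dV/d(eps) of the compound system: integral of om1(x) * om2(E - x) for 0 <= x <= E.

    Each density is a polynomial stored as {power: coefficient} (an assumed representation).
    """
    out = {}
    for a, ca in om1.items():
        for b, cb in om2.items():
            # Beta integral: int_0^E x^a (E - x)^b dx = a! b! / (a + b + 1)! * E^(a + b + 1)
            c = Fraction(ca) * Fraction(cb) * Fraction(
                factorial(a) * factorial(b), factorial(a + b + 1)
            )
            out[a + b + 1] = out.get(a + b + 1, 0) + c
    return out


def integrate(poly):
    """V(eps) obtained from dV/d(eps), with V(0) = 0."""
    return {k + 1: Fraction(c) / (k + 1) for k, c in poly.items()}


def evaluate(poly, e):
    return sum(c * Fraction(e) ** k for k, c in poly.items())


def tau(omega, e):
    """d(eps)/d(log V) = V / (dV/d eps), evaluated at energy e."""
    return evaluate(integrate(omega), e) / evaluate(omega, e)


# Case 1: V = eps^3, so d(eps)/d(log V) = eps/3 is linear in eps.
quadratic = {2: 3}
# Case 2: dV/d(eps) = 1 + eps, so d(eps)/d(log V) is not linear in eps.
other = {0: 1, 1: 1}

for name, omega in [("linear case", quadratic), ("non-linear case", other)]:
    part = tau(omega, 1)                        # each part holds energy 1
    whole = tau(convolve(omega, omega), 2)      # compound system holds energy 2
    print(name, part, whole, part == whole)
```

In the first case the two values agree, in the second they do not, which is the distinction drawn above between the linear case and the general one.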
It is true, however, in some very important particular cases, as when the energy is a quadratic function of the $p$’s and $q$’s, or of the $p$’s alone.* When the equation holds, the case is analogous to that of bodies in thermodynamics for which the specific heat for constant volume is constant.

Another quantity which is closely related to temperature is $d\phi/d\epsilon$.

2, the average value of d

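$$\overline{\frac{d\phi_1}{d\epsilon_1}}\Big|_{\epsilon_{12}} = \overline{\frac{d\phi_2}{d\epsilon_2}}\Big|_{\epsilon_{12}} = \frac{d\phi_{12}}{d\epsilon_{12}}$$
if $n_1 > 2$, and $n_2 > 2$.

This analogy with temperature has the same incompleteness which was noticed with respect to $d\epsilon/d\log V$, viz., if two systems have such energies ($\epsilon_1$ and $\epsilon_2$) that
$$\frac{d\phi_1}{d\epsilon_1} = \frac{d\phi_2}{d\epsilon_2},$$
and they are combined to form a third system with energy
$$\epsilon_{12} = \epsilon_1 + \epsilon_2,$$
we shall not have in general
$$\frac{d\phi_{12}}{d\epsilon_{12}} = \frac{d\phi_1}{d\epsilon_1} = \frac{d\phi_2}{d\epsilon_2}.$$
Thus, if the energy is a quadratic function of the $p$’s and $q$’s, we have*
$$\frac{d\phi_1}{d\epsilon_1} = \frac{n_1 - 1}{\epsilon_1}, \qquad \frac{d\phi_2}{d\epsilon_2} = \frac{n_2 - 1}{\epsilon_2},$$
$$\frac{d\phi_{12}}{d\epsilon_{12}} = \frac{n_{12} - 1}{\epsilon_{12}} = \frac{n_1 + n_2 - 1}{\epsilon_1 + \epsilon_2},$$
where $n_1$, $n_2$, $n_{12}$ are the numbers of degrees of freedom of the separate and combined systems. But
$$\frac{d\phi_1}{d\epsilon_1} = \frac{d\phi_2}{d\epsilon_2} = \frac{n_1 + n_2 - 2}{\epsilon_1 + \epsilon_2}.$$
If the energy is a quadratic function of the $p$’s alone, the case would be the same except that we should have $\tfrac{1}{2}n_1$, $\tfrac{1}{2}n_2$, $\tfrac{1}{2}n_{12}$, instead of $n_1$, $n_2$, $n_{12}$. In these particular cases, the analogy between $d\epsilon/d\log V$ and temperature would be complete, as has already been remarked. We should have
$$\frac{d\epsilon_1}{d\log V_1} = \frac{\epsilon_1}{n_1}, \qquad \frac{d\epsilon_2}{d\log V_2} = \frac{\epsilon_2}{n_2},$$
$$\frac{d\epsilon_{12}}{d\log V_{12}} = \frac{\epsilon_{12}}{n_{12}} = \frac{d\epsilon_1}{d\log V_1} = \frac{d\epsilon_2}{d\log V_2},$$
when the energy is a quadratic function of the $p$’s and $q$’s, and similar equations with $\tfrac{1}{2}n_1$, $\tfrac{1}{2}n_2$, $\tfrac{1}{2}n_{12}$, instead of $n_1$, $n_2$, $n_{12}$, when the energy is a quadratic function of the $p$’s alone.

* See foot-note on page 93. We have here made the least value of the energy consistent with the values of the external coordinates zero instead of $\epsilon_a$, as is evidently allowable when the external coordinates are supposed invariable.

More characteristic of $d\phi/d\epsilon$ are its properties relating to most probable values of energy. If a system having two parts with separate energies and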
each with more than two degrees of freedom is microcanonically distributed in phase, the most probable division of energy between the parts, in a system taken at random from the ensemble, satisfies the equation
$$\frac{d\phi_1}{d\epsilon_1} = \frac{d\phi_2}{d\epsilon_2}, \tag{488}$$
which corresponds to the thermodynamic theorem that the distribution of energy between the parts of a system, in case of thermal equilibrium, is such that the temperatures of the parts are equal.

To prove the theorem, we observe that the fractional part of the whole number of systems which have the energy of one part between the limits $\epsilon_1'$ and $\epsilon_1''$ is expressed by
$$e^{-\phi_{12}} \int_{\epsilon_1'}^{\epsilon_1''} e^{\phi_1 + \phi_2}\, d\epsilon_1,$$
where the variables are connected by the equation
$$\epsilon_1 + \epsilon_2 = \text{constant} = \epsilon_{12}.$$
The greatest value of this expression, for a constant infinitesimal value of the difference $\epsilon_1'' - \epsilon_1'$, determines a value of $\epsilon_1$, which we may call its most probable value. This depends on the greatest possible value of $\phi_1 + \phi_2$. Now if $n_1 > 2$, and $n_2 > 2$, we shall have $\phi_1 = -\infty$ for the least possible value of $\epsilon_1$, and $\phi_2 = -\infty$ for the least possible value of $\epsilon_2$. Between these limits $\phi_1$ and $\phi_2$ will be finite and continuous. Hence $\phi_1 + \phi_2$ will have a maximum satisfying the equation (488).

But if $n_1 \leq 2$, or $n_2 \leq 2$, $d\phi_1/d\epsilon_1$ or [...] $\phi$ will correspond to entropy. It has been defined as $\log (dV/d\epsilon)$. In the considerations on which its definition is founded, it is therefore very similar to $\log V$. We have seen that $d\phi/d\log V$ approaches the value unity when $n$ is very great.*

To form a differential equation on the model of the thermodynamic equation (482), in which $d\epsilon/d\phi$ shall take the place of temperature, and $\phi$ of entropy, we may write
$$d\epsilon = \left(\frac{d\epsilon}{d\phi}\right)_a d\phi + \left(\frac{d\epsilon}{da_1}\right)_{\phi,a} da_1 + \left(\frac{d\epsilon}{da_2}\right)_{\phi,a} da_2 + \text{etc.}, \tag{489}$$
or
$$d\phi = \frac{d\phi}{d\epsilon}\, d\epsilon + \frac{d\phi}{da_1}\, da_1 + \frac{d\phi}{da_2}\, da_2 + \text{etc.} \tag{490}$$
With respect to the differential coefficients in the last equation, which corresponds exactly to (482) solved with respect to $d\eta$, we have seen that their average values in a canonical ensemble are equal to $1/\Theta$, and the averages of $A_1/\Theta$, $A_2/\Theta$, etc.† We have also seen that $d\epsilon/d\phi$ (or $d\phi/d\epsilon$) has relations to the most probable values of energy in parts of a microcanonical ensemble. That $(d\epsilon/da_1)_{\phi,a}$, etc., have properties somewhat analogous, may be shown as follows.

* See Chapter X, pages 120, 121.
† See Chapter IX, equations (321), (327).

In a physical experiment, we measure a force by balancing it against another. If we should ask what force applied to increase or diminish $a_1$ would balance the action of the systems, it would be one which varies with the different systems. But we may ask what single force will make a given value of $a_1$ the most probable, and we shall find that under certain conditions $(d\epsilon/da_1)_{\phi,a}$ represents that force.

To make the problem definite, let us consider a system consisting of the original system together with another having the coordinates $a_1$, $a_2$, etc., and forces $A_1'$, $A_2'$, etc., tending to increase those coordinates. These are in addition to the forces $A_1$, $A_2$, etc., exerted by the original system, and are derived from a force-function ($-\epsilon_q'$) by the equations
$$A_1' = -\frac{d\epsilon_q'}{da_1}, \qquad A_2' = -\frac{d\epsilon_q'}{da_2}, \qquad \text{etc.}$$
For the energy of the whole system we may write
$$E = \epsilon + \epsilon_q' + \tfrac{1}{2} m_1 \dot{a}_1^2 + \tfrac{1}{2} m_2 \dot{a}_2^2 + \text{etc.},$$
and for the extension-in-phase of the whole system within any limits
$$\int \ldots \int dp_1 \ldots dq_n\; da_1\, m_1\, d\dot{a}_1\; da_2\, m_2\, d\dot{a}_2 \ldots$$
or
$$\int \ldots \int e^{\phi}\, d\epsilon\; da_1\, m_1\, d\dot{a}_1\; da_2\, m_2\, d\dot{a}_2 \ldots,$$
or again
$$\int \ldots \int e^{\phi}\, dE\; da_1\, m_1\, d\dot{a}_1\; da_2\, m_2\, d\dot{a}_2 \ldots,$$
since $d\epsilon = dE$, when $a_1$, $\dot{a}_1$, $a_2$, $\dot{a}_2$, etc., are constant. If the limits are expressed by $E$ and $E + dE$, $a_1$ and $a_1 + da_1$, $\dot{a}_1$ and $\dot{a}_1 + d\dot{a}_1$, etc., the integral reduces to
$$e^{\phi}\, dE\; da_1\, m_1\, d\dot{a}_1\; da_2\, m_2\, d\dot{a}_2 \ldots$$
The values of $a_1$, $\dot{a}_1$, $a_2$, $\dot{a}_2$, etc., which make this expression a maximum for constant values of the energy of the whole system and of the differentials $dE$, $da_1$, $d\dot{a}_1$, etc., are what may be called the most probable values of $a_1$, $\dot{a}_1$, etc., in an ensemble in which the whole system is distributed microcanonically. To determine these values we have
$$d e^{\phi} = 0,$$
when
$$d\left(\epsilon + \epsilon_q' + \tfrac{1}{2} m_1 \dot{a}_1^2 + \tfrac{1}{2} m_2 \dot{a}_2^2 + \text{etc.}\right) = 0.$$
That is,
$$d\phi = 0,$$
when
$$\left(\frac{d\epsilon}{d\phi}\right)_a d\phi + \left(\frac{d\epsilon}{da_1}\right)_{\phi,a} da_1 - A_1'\, da_1 + \text{etc.} + m_1 \dot{a}_1\, d\dot{a}_1 + \text{etc.} = 0.$$
This requires
$$\dot{a}_1 = 0, \qquad \dot{a}_2 = 0, \qquad \text{etc.},$$
and
$$\left(\frac{d\epsilon}{da_1}\right)_{\phi,a} = A_1', \qquad \left(\frac{d\epsilon}{da_2}\right)_{\phi,a} = A_2', \qquad \text{etc.}$$
This shows that for any given values of $E$, $a_1$, $a_2$, etc., $(d\epsilon/da_1)_{\phi,a}$, $(d\epsilon/da_2)_{\phi,a}$, etc., represent the forces (in the generalized sense) which the external bodies would have to exert to make these values of $a_1$, $a_2$, etc., the most probable under the conditions specified. When the differences of the external forces which are exerted by the different systems are negligible, $-(d\epsilon/da_1)_{\phi,a}$, etc., represent these forces.

It is certainly in the quantities relating to a canonical ensemble, $\bar{\epsilon}$, $\Theta$, $\bar{\eta}$, $\bar{A}_1$, etc., $a_1$, etc., that we find the most complete correspondence with the quantities of the thermodynamic equation (482). Yet the conception itself of the canonical ensemble may seem to some artificial, and hardly germane to a natural exposition of the subject; and the quantities
$$\epsilon, \quad \frac{d\epsilon}{d\log V}, \quad \log V, \quad \overline{A_1}\big|_{\epsilon}, \ \text{etc.}, \quad a_1, \ \text{etc.},$$
or
$$\epsilon, \quad \frac{d\epsilon}{d\phi}, \quad \phi, \quad \left(\frac{d\epsilon}{da_1}\right)_{\phi,a}, \ \text{etc.}, \quad a_1, \ \text{etc.},$$
which are closely related to ensembles of constant energy, and to average and most probable values in such ensembles, and most of which are defined without reference to any ensemble, may appear the most natural analogues of the thermodynamic quantities.

In regard to the naturalness of seeking analogies with the thermodynamic behavior of bodies in canonical or microcanonical ensembles of systems, much will depend upon how we approach the subject, especially upon the question whether we regard energy or temperature as an independent variable.
It is very natural to take energy for an independent variable rather than temperature, because ordinary mechanics furnishes us with a perfectly defined conception of energy, whereas the idea of something relating to a mechanical system and corresponding to temperature is a notion but vaguely defined. Now if the state of a system is given by its energy and the external coordinates, it is incompletely defined, although its partial definition is perfectly clear as far as it goes. The ensemble of phases microcanonically distributed, with the given values of the energy and the external coordinates, will represent the imperfectly defined system better than any other ensemble or single phase. When we approach the subject from this side, our theorems will naturally relate to average values, or most probable values, in such ensembles.

In this case, the choice between the variables of (485) or of (489) will be determined partly by the relative importance which is attached to average and probable values. It would seem that in general average values are the most important, and that they lend themselves better to analytical transformations. This consideration would give the preference to the system of variables in which $\log V$ is the analogue of entropy. Moreover, if we make $\phi$ the analogue of entropy, we are embarrassed by the necessity of making numerous exceptions for systems of one or two degrees of freedom.

On the other hand, the definition of $\phi$ may be regarded as a little more simple than that of $\log V$, and if our choice is determined by the simplicity of the definitions of the analogues of entropy and temperature, it would seem that the $\phi$ system should have the preference. In our definition of these quantities, $V$ was defined first, and $e^{\phi}$ derived from $V$ by differentiation. This gives the relation of the quantities in the most simple analytical form.
Yet so far as the notions are concerned, it is perhaps more natural to regard $V$ as derived from $e^{\phi}$ by integration. At all events, $e^{\phi}$ may be defined independently of $V$, and its definition may be regarded as more simple as not requiring the determination of the zero from which $V$ is measured, which sometimes involves questions of a delicate nature. In fact, the quantity $e^{\phi}$ may exist, when the definition of $V$ becomes illusory for practical purposes, as the integral by which it is determined becomes infinite.

The case is entirely different, when we regard the temperature as an independent variable, and we have to consider a system which is described as having a certain temperature and certain values for the external coordinates. Here also the state of the system is not completely defined, and will be better represented by an ensemble of phases than by any single phase. What is the nature of such an ensemble as will best represent the imperfectly defined state?

When we wish to give a body a certain temperature, we place it in a bath of the proper temperature, and when we regard what we call thermal equilibrium as established, we say that the body has the same temperature as the bath. Perhaps we place a second body of standard character, which we call a thermometer, in the bath, and say that the first body, the bath, and the thermometer, have all the same temperature.

But the body under such circumstances, as well as the bath, and the thermometer, even if they were entirely isolated from external influences (which it is convenient to suppose in a theoretical discussion), would be continually changing in phase, and in energy as well as in other respects, although our means of observation are not fine enough to perceive these variations.

The series of phases through which the whole system runs in the course of time may not be entirely determined by the energy, but may depend on the initial phase in other respects.
In such cases the ensemble obtained by the microcanonical distribution of the whole system, which includes all possible time-ensembles combined in the proportion which seems least arbitrary, will represent better than any one time-ensemble the effect of the bath. Indeed a single time-ensemble, when it is not also a microcanonical ensemble, is too ill-defined a notion to serve the purposes of a general discussion. We will therefore direct our attention, when we suppose the body placed in a bath, to the microcanonical ensemble of phases thus obtained.

If we now suppose the quantity of the substance forming the bath to be increased, the anomalies of the separate energies of the body and of the thermometer in the microcanonical ensemble will be increased, but not without limit. The anomalies of the energy of the bath, considered in comparison with its whole energy, diminish indefinitely as the quantity of the bath is increased, and become in a sense negligible, when the quantity of the bath is sufficiently increased. The ensemble of phases of the body, and of the thermometer, approach a standard form as the quantity of the bath is indefinitely increased. This limiting form is easily shown to be what we have described as the canonical distribution.

Let us write $\epsilon$ for the energy of the whole system consisting of the body first mentioned, the bath, and the thermometer (if any), and let us first suppose this system to be distributed canonically with the modulus $\Theta$. We have by (205)
$$\overline{(\epsilon - \bar{\epsilon})^2} = \Theta^2 \frac{d\bar{\epsilon}}{d\Theta},$$
and since
$$\bar{\epsilon}_p = \frac{n}{2}\,\Theta, \qquad \frac{d\bar{\epsilon}}{d\Theta} = \frac{n}{2}\,\frac{d\bar{\epsilon}}{d\bar{\epsilon}_p}.$$
If we write $\Delta\epsilon$ for the anomaly of mean square, we have
$$(\Delta\epsilon)^2 = \overline{(\epsilon - \bar{\epsilon})^2} = \Theta^2 \frac{d\bar{\epsilon}}{d\Theta}.$$
If we set
$$\Delta\Theta = \frac{d\Theta}{d\bar{\epsilon}}\,\Delta\epsilon,$$
$\Delta\Theta$ will represent approximately the increase of $\Theta$ which would produce an increase in the average value of the energy equal to its anomaly of mean square. Now these equations give
$$(\Delta\Theta)^2 = \frac{2\Theta^2}{n}\,\frac{d\bar{\epsilon}_p}{d\bar{\epsilon}},$$
which shows that we may diminish $\Delta\Theta$ indefinitely by increasing the quantity of the bath.
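The rate at which the anomaly of the modulus falls off can be tabulated directly from the relation $(\Delta\Theta)^2 = (2\Theta^2/n)\,(d\bar{\epsilon}_p/d\bar{\epsilon})$. The following Python sketch is only an illustration under stated assumptions: the function name `modulus_anomaly` and the parameter `dep_de` (standing for $d\bar{\epsilon}_p/d\bar{\epsilon}$) are hypothetical, and `dep_de` is treated as a constant, equal to 1 when the energy is entirely kinetic and to 1/2 when the energy is a quadratic function of the $p$'s and $q$'s.

```python
import math


def modulus_anomaly(theta, n, dep_de=1.0):
    """Return Delta Theta from (Delta Theta)^2 = (2 Theta^2 / n) * d(eps_p)/d(eps).

    theta  : modulus of the canonical ensemble (whole system, bath included)
    n      : number of degrees of freedom of the whole system
    dep_de : assumed constant value of d(eps_p)/d(eps); 1.0 for purely kinetic energy
    """
    return theta * math.sqrt(2.0 * dep_de / n)


# Relative anomaly Delta Theta / Theta as the bath (and hence n) is enlarged.
for n in (10, 10**4, 10**12, 10**24):
    print(n, modulus_anomaly(1.0, n))
```

Under these assumptions the relative anomaly falls off as the inverse square root of $n$, so enlarging the bath makes it as small as we please.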
Now our canonical ensemble consists of an infinity of microcanonical ensembles, which differ only in consequence of the different values of the energy which is constant in each. If we consider separately the phases of the first body which occur in the canonical ensemble of the whole system, these phases will form a canonical ensemble of the same modulus. This canonical ensemble of phases of the first body will consist of parts which belong to the different microcanonical ensembles into which the canonical ensemble of the whole system is divided.

Let us now imagine that the modulus of the principal canonical ensemble is increased by $2\Delta\Theta$, and its average energy by $2\Delta\epsilon$. The modulus of the canonical ensemble of the phases of the first body considered separately will be increased by $2\Delta\Theta$. We may regard the infinity of microcanonical ensembles into which we have divided the principal canonical ensemble as each having its energy increased by $2\Delta\epsilon$. Let us see how the ensembles of phases of the first body contained in these microcanonical ensembles are affected. We may assume that they will all be affected in about the same way, as all the differences which come into account may be treated as small. Therefore, the canonical ensemble formed by taking them together will also be affected in the same way. But we know how this is affected. It is by the increase of its modulus by $2\Delta\Theta$, a quantity which vanishes when the quantity of the bath is indefinitely increased.

In the case of an infinite bath, therefore, the increase of the energy of one of the microcanonical ensembles by $2\Delta\epsilon$, produces a vanishing effect on the distribution in energy of the phases of the first body which it contains. But $2\Delta\epsilon$ is more than the average difference of energy between the microcanonical ensembles.
The distribution in energy of these phases is therefore the same in the different microcanonical ensembles, and must therefore be canonical, like that of the ensemble which they form when taken together.*

* In order to appreciate the above reasoning, it should be understood that the differences of energy which occur in the canonical ensemble of phases of the first body are not here regarded as vanishing quantities. To fix one’s ideas, one may imagine that he has the fineness of perception to make these differences seem large. The difference between the part of these phases which belong to one microcanonical ensemble of the whole system and the part which belongs to another would still be imperceptible, when the quantity of the bath is sufficiently increased.

As a general theorem, the conclusion may be expressed in the words: If a system of a great number of degrees of freedom is microcanonically distributed in phase, any very small part of it may be regarded as canonically distributed.*

It would seem, therefore, that a canonical ensemble of phases is what best represents, with the precision necessary for exact mathematical reasoning, the notion of a body with a given temperature, if we conceive of the temperature as the state produced by such processes as we actually use in physics to produce a given temperature. Since the anomalies of the body increase with the quantity of the bath, we can only get rid of all that is arbitrary in the ensemble of phases which is to represent the notion of a body of a given temperature by making the bath infinite, which brings us to the canonical distribution.

A comparison of temperature and entropy with their analogues in statistical mechanics would be incomplete without a consideration of their differences with respect to units and zeros, and the numbers used for their numerical specification.
If we apply the notions of statistical mechanics to such bodies as we usually consider in thermodynamics, for which the kinetic energy is of the same order of magnitude as the unit of energy, but the number of degrees of freedom is enormous, the values of $\Theta$, $d\epsilon/d\log V$, and $d\epsilon/d\phi$ will be of the same order of magnitude as $1/n$, and the variable part of $\bar{\eta}$, $\log V$, and $\phi$ will be of the same order of magnitude as $n$.† If these quantities, therefore, represent in any sense the notions of temperature and entropy, they will nevertheless not be measured in units of the usual order of magnitude, a fact which must be borne in mind in determining what magnitudes may be regarded as insensible to human observation.

* It is assumed (and without this assumption the theorem would have no distinct meaning) that the part of the ensemble considered may be regarded as having separate energy.
† See equations (124), (288), (289), and (314); also page 106.

Now nothing prevents our supposing energy and time in our statistical formulae to be measured in such units as may be convenient for physical purposes. But when these units have been chosen, the numerical values of $\Theta$, $d\epsilon/d\log V$, $d\epsilon/d\phi$, $\bar{\eta}$, $\log V$, $\phi$, are entirely determined,* and in order to compare them with temperature and entropy, the numerical values of which depend upon an arbitrary unit, we must multiply all values of $\Theta$, $d\epsilon/d\log V$, $d\epsilon/d\phi$ by a constant ($K$), and divide all values of $\bar{\eta}$, $\log V$, and $\phi$ by the same constant. This constant is the same for all bodies, and depends only on the units of temperature and energy which we employ. For ordinary units it is of the same order of magnitude as the numbers of atoms in ordinary bodies.

We are not able to determine the numerical value of $K$, as it depends on the number of molecules in the bodies with which we experiment.
To fix our ideas, however, we may seek an expression for this value, based upon very probable assumptions, which will show how we would naturally proceed to its evaluation, if our powers of observation were fine enough to take cognizance of individual molecules.

* The unit of time only affects the last three quantities, and these only by an additive constant, which disappears (with the additive constant of entropy), when differences of entropy are compared with their statistical analogues. See page 19.

If the unit of mass of a monatomic gas contains $\nu$ atoms, and it may be treated as a system of $3\nu$ degrees of freedom, which seems to be the case, we have for canonical distribution
$$\bar{\epsilon}_p = \tfrac{3}{2}\,\nu\,\Theta, \qquad \frac{d\bar{\epsilon}_p}{d\Theta} = \tfrac{3}{2}\,\nu. \tag{491}$$
If we write $T$ for temperature, and $c_v$ for the specific heat of the gas for constant volume (or rather the limit toward which this specific heat tends, as rarefaction is indefinitely increased), we have
$$\frac{d\epsilon_p}{dT} = c_v, \tag{492}$$
since we may regard the energy as entirely kinetic. We may set the $\epsilon_p$ of this equation equal to the $\bar{\epsilon}_p$ of the preceding, where indeed the individual values of which the average is taken would appear to human observation as identical. This gives
$$\frac{d\Theta}{dT} = \frac{2 c_v}{3\nu},$$
whence
$$\frac{1}{K} = \frac{2 c_v}{3\nu}, \tag{493}$$
a value recognized by physicists as a constant independent of the kind of monatomic gas considered.

We may also express the value of $K$ in a somewhat different form, which corresponds to the indirect method by which physicists are accustomed to determine the quantity $c_v$. The kinetic energy due to the motions of the centers of mass of the molecules of a mass of gas sufficiently expanded is easily shown to be equal to $\tfrac{3}{2}\,p\,v$, where $p$ and $v$ denote the pressure and volume. The average value of the same energy in a canonical ensemble of such a mass of gas is $\tfrac{3}{2}\,\Theta\,\nu$, where $\nu$ denotes the number of molecules in the gas.
Equating these values, we have
$$p\,v = \Theta\,\nu, \tag{494}$$
whence
$$\frac{1}{K} = \frac{\Theta}{T} = \frac{p\,v}{\nu\,T}. \tag{495}$$
Now the laws of Boyle, Charles, and Avogadro may be expressed by the equation
$$p\,v = A\,\nu\,T, \tag{496}$$
where $A$ is a constant depending only on the units in which energy and temperature are measured. $1/K$, therefore, might be called the constant of the law of Boyle, Charles, and Avogadro as expressed with reference to the true number of molecules in a gaseous body.

Since such numbers are unknown to us, it is more convenient to express the law with reference to relative values. If we denote by $M$ the so-called molecular weight of a gas, that is, a number taken from a table of numbers proportional to the weights of various molecules and atoms, but having one of the values, perhaps the atomic weight of hydrogen, arbitrarily made unity, the law of Boyle, Charles, and Avogadro may be written in the more practical form
$$p\,v = A'\,\frac{m}{M}\,T, \tag{497}$$
where $A'$ is a constant and $m$ the weight of gas considered. It is evident that $1/K$ is equal to the product of the constant of the law in this form and the (true) weight of an atom of hydrogen, or such other atom or molecule as may be given the value unity in the table of molecular weights.

In the following chapter we shall consider the necessary modifications in the theory of equilibrium, when the quantity of matter contained in a system is to be regarded as variable, or, if the system contains more than one kind of matter, when the quantities of the several kinds of matter in the system are to be regarded as independently variable. This will give us yet another set of variables in the statistical equation, corresponding to those of the amplified form of the thermodynamic equation.

CHAPTER XV.

SYSTEMS COMPOSED OF MOLECULES.
The nature of material bodies is such that especial interest attaches to the dynamics of systems composed of a great number of entirely similar particles, or, it may be, of a great number of particles of several kinds, all of each kind being entirely similar to each other. We shall therefore proceed to consider systems composed of such particles, whether in great numbers or otherwise, and especially to consider the statistical equilibrium of ensembles of such systems. One of the variations to be considered in regard to such systems is a variation in the numbers of the particles of the various kinds which it contains, and the question of statistical equilibrium between two ensembles of such systems relates in part to the tendencies of the various kinds of particles to pass from the one to the other.

First of all, we must define precisely what is meant by statistical equilibrium of such an ensemble of systems. The essence of statistical equilibrium is the permanence of the number of systems which fall within any given limits with respect to phase. We have therefore to define how the term “phase” is to be understood in such cases. If two phases differ only in that certain entirely similar particles have changed places with one another, are they to be regarded as identical or different phases? If the particles are regarded as indistinguishable, it seems in accordance with the spirit of the statistical method to regard the phases as identical. In fact, it might be urged that in such an ensemble of systems as we are considering no identity is possible between the particles of different systems except that of qualities, and if $\nu$ particles of one system are described as entirely similar to one another and to $\nu$ of another system, nothing remains on which to base the identification of any particular particle of the first system with any particular particle of the second.
And this would be true, if the ensemble of systems had a simultaneous objective existence. But it hardly applies to the creations of the imagination. In the cases which we have been considering, and in those which we shall consider, it is not only possible to conceive of the motion of an ensemble of similar systems simply as possible cases of the motion of a single system, but it is actually in large measure for the sake of representing more clearly the possible cases of the motion of a single system that we use the conception of an ensemble of systems. The perfect similarity of several particles of a system will not in the least interfere with the identification of a particular particle in one case with a particular particle in another. The question is one to be decided in accordance with the requirements of practical convenience in the discussion of the problems with which we are engaged.

Our present purpose will often require us to use the terms phase, density-in-phase, statistical equilibrium, and other connected terms on the supposition that phases are not altered by the exchange of places between similar particles. Some of the most important questions with which we are concerned have reference to phases thus defined. We shall call them phases determined by generic definitions, or briefly, generic phases. But we shall also be obliged to discuss phases defined by the narrower definition (so that exchange of position between similar particles is regarded as changing the phase), which will be called phases determined by specific definitions, or briefly, specific phases. For the analytical description of a specific phase is more simple than that of a generic phase. And it is a more simple matter to make a multiple integral extend over all possible specific phases than to make one extend without repetition over all possible generic phases.

It is evident that if $\nu_1$, $\nu_2$ . . .
$\nu_h$ are the numbers of the different kinds of molecules in any system, the number of specific phases embraced in one generic phase is represented by the continued product $\nu_1!\,\nu_2! \ldots \nu_h!$, and the coefficient of probability of a generic phase is the sum of the probability-coefficients of the specific phases which it represents. When these are equal among themselves, the probability-coefficient of the generic phase is equal to that of the specific phase multiplied by $\nu_1!\,\nu_2! \ldots \nu_h!$. It is also evident that statistical equilibrium may subsist with respect to generic phases without statistical equilibrium with respect to specific phases, but not vice versa.

Similar questions arise where one particle is capable of several equivalent positions. Does the change from one of these positions to another change the phase? It would be most natural and logical to make it affect the specific phase, but not the generic. The number of specific phases contained in a generic phase would then be $\nu_1!\,\kappa_1^{\nu_1} \ldots \nu_h!\,\kappa_h^{\nu_h}$, where $\kappa_1, \ldots \kappa_h$ denote the numbers of equivalent positions belonging to the several kinds of particles. The case in which a $\kappa$ is infinite would then require especial attention. It does not appear that the resulting complications in the formulae would be compensated by any real advantage. The reason of this is that in problems of real interest equivalent positions of a particle will always be equally probable. In this respect, equivalent positions of the same particle are entirely unlike the $\nu!$ different ways in which $\nu$ particles may be distributed in $\nu$ different positions. Let it therefore be understood that in spite of the physical equivalence of different positions of the same particle they are to be considered as constituting a difference of generic phase as well as of specific. The number of specific phases contained in a generic phase is therefore always given by the product $\nu_1!\,\nu_2! \ldots \nu_h!$.
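The counting rule just stated can be checked by brute force for small numbers of particles. The Python sketch below is an illustration only, and the function names are hypothetical. It assumes a generic phase is given by a list of occupied places, each tagged with the kind of particle found there, and it counts the ways of assigning individually labelled particles to those places so that the kinds agree.

```python
from itertools import permutations
from math import factorial, prod


def specific_per_generic(counts):
    """Number of specific phases in one generic phase: nu_1! * nu_2! * ... * nu_h!."""
    return prod(factorial(nu) for nu in counts)


def count_by_enumeration(counts):
    """Brute-force count (illustrative; feasible only for a handful of particles).

    The generic phase is modelled as a list of places tagged by kind; each labelled
    particle is a (kind, index) pair. An assignment is valid when every place
    receives a particle of the matching kind.
    """
    places = [kind for kind, nu in enumerate(counts) for _ in range(nu)]
    particles = [(kind, i) for kind, nu in enumerate(counts) for i in range(nu)]
    return sum(
        all(p[0] == place for p, place in zip(perm, places))
        for perm in permutations(particles)
    )


for counts in [(2, 1), (2, 2), (3, 2)]:
    print(counts, specific_per_generic(counts), count_by_enumeration(counts))
```

When the specific phases are equally probable, multiplying the specific probability-coefficient by `specific_per_generic(counts)` gives the generic one, as stated above.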
Instead of considering, as in the preceding chapters, ensembles of systems differing only in phase, we shall now suppose that the systems constituting an ensemble are composed of particles of various kinds, and that they differ not only in phase but also in the numbers of these particles which they contain. The external coordinates of all the systems in the ensemble are supposed, as heretofore, to have the same value, and when they vary, to vary together. For distinction, we may call such an ensemble a grand ensemble, and one in which the systems differ only in phase a petit ensemble. A grand ensemble is therefore composed of a multitude of petit ensembles. The ensembles which we have hitherto discussed are petit ensembles.

Let $\nu_1, \ldots \nu_h$, etc., denote the numbers of the different kinds of particles in a system, $\epsilon$ its energy, and $q_1, \ldots q_n$, $p_1, \ldots p_n$ its coordinates and momenta. If the particles are of the nature of material points, the number of coordinates ($n$) of the system will be equal to $3\nu_1 \ldots + 3\nu_h$. But if the particles are less simple in their nature, if they are to be treated as rigid solids, the orientation of which must be regarded, or if they consist each of several atoms, so as to have more than three degrees of freedom, the number of coordinates of the system will be equal to the sum of $\nu_1$, $\nu_2$, etc., multiplied each by the number of degrees of freedom of the kind of particle to which it relates.

Let us consider an ensemble in which the number of systems having $\nu_1, \ldots \nu_h$ particles of the several kinds, and having values of their coordinates and momenta lying between the limits $q_1$ and $q_1 + dq_1$, $p_1$ and $p_1 + dp_1$, etc., is represented by the expression

$$\frac{N e^{\frac{\Omega+\mu_1\nu_1\ldots+\mu_h\nu_h-\epsilon}{\Theta}}}{\nu_1!\ldots\nu_h!}\,dp_1\ldots dq_n \tag{498}$$

where $N$, $\Omega$, $\Theta$, $\mu_1, \ldots \mu_h$ are constants, $N$ denoting the total number of systems in the ensemble.
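As an illustration of the two rules of this passage, the following sketch counts the coordinates of a system and evaluates the coefficient of $dp_1 \ldots dq_n$ in expression (498). The function and argument names are hypothetical, and the numerical values in the usage note are arbitrary.

```python
from math import exp, factorial, prod


def coordinate_count(counts, dof_per_particle):
    """n = sum over kinds of (number of particles) * (degrees of freedom per particle)."""
    return sum(n * f for n, f in zip(counts, dof_per_particle))


def density_in_phase(total_systems, omega, mus, counts, energy, theta):
    """Coefficient of dp_1 ... dq_n in expression (498), for a specific phase."""
    exponent = (omega + sum(m * n for m, n in zip(mus, counts)) - energy) / theta
    return total_systems * exp(exponent) / prod(factorial(n) for n in counts)
```

Four material points (three degrees of freedom each) together with two rigid bodies (six each) give `coordinate_count([4, 2], [3, 6]) == 24`.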
The expression

$$\frac{N e^{\frac{\Omega+\mu_1\nu_1\ldots+\mu_h\nu_h-\epsilon}{\Theta}}}{\nu_1!\ldots\nu_h!} \tag{499}$$

evidently represents the density-in-phase of the ensemble within the limits described, that is, for a phase specifically defined. The expression

$$\frac{e^{\frac{\Omega+\mu_1\nu_1\ldots+\mu_h\nu_h-\epsilon}{\Theta}}}{\nu_1!\ldots\nu_h!} \tag{500}$$

is therefore the probability-coefficient for a phase specifically defined. This has evidently the same value for all the $\nu_1! \ldots \nu_h!$ phases obtained by interchanging the phases of particles of the same kind. The probability-coefficient for a generic phase will be $\nu_1! \ldots \nu_h!$ times as great, viz.,

$$e^{\frac{\Omega+\mu_1\nu_1\ldots+\mu_h\nu_h-\epsilon}{\Theta}}. \tag{501}$$

We shall say that such an ensemble as has been described is canonically distributed, and shall call the constant $\Theta$ its modulus. It is evidently what we have called a grand ensemble. The petit ensembles of which it is composed are canonically distributed, according to the definitions of Chapter IV, since the expression

$$\frac{e^{\frac{\Omega+\mu_1\nu_1\ldots+\mu_h\nu_h}{\Theta}}}{\nu_1!\ldots\nu_h!} \tag{502}$$

is constant for each petit ensemble. The grand ensemble, therefore, is in statistical equilibrium with respect to specific phases.

If an ensemble, whether grand or petit, is identical so far as generic phases are concerned with one canonically distributed, we shall say that its distribution is canonical with respect to generic phases. Such an ensemble is evidently in statistical equilibrium with respect to generic phases, although it may not be so with respect to specific phases.

If we write $\mathrm{H}$ for the index of probability of a generic phase in a grand ensemble, we have for the case of canonical distribution

$$\mathrm{H} = \frac{\Omega+\mu_1\nu_1\ldots+\mu_h\nu_h-\epsilon}{\Theta}. \tag{503}$$

It will be observed that $\mathrm{H}$ is a linear function of $\epsilon$ and $\nu_1, \ldots \nu_h$; also that whenever the index of probability of generic phases in a grand ensemble is a linear function of $\epsilon, \nu_1, \ldots \nu_h$, the ensemble is canonically distributed with respect to generic phases.

The constant $\Omega$ we may regard as determined by the equation

$$N = \sum_{\nu_1}\ldots\sum_{\nu_h} \int\!\ldots\!\int_{\text{all phases}} \frac{N e^{\frac{\Omega+\mu_1\nu_1\ldots+\mu_h\nu_h-\epsilon}{\Theta}}}{\nu_1!\ldots\nu_h!}\,dp_1\ldots dq_n, \tag{504}$$

or

$$e^{-\frac{\Omega}{\Theta}} = \sum_{\nu_1}\ldots\sum_{\nu_h} \frac{e^{\frac{\mu_1\nu_1\ldots+\mu_h\nu_h}{\Theta}}}{\nu_1!\ldots\nu_h!} \int\!\ldots\!\int_{\text{all phases}} e^{-\frac{\epsilon}{\Theta}}\,dp_1\ldots dq_n, \tag{505}$$

where the multiple sum indicated by $\sum_{\nu_1}\ldots\sum_{\nu_h}$ includes all terms obtained by giving to each of the symbols $\nu_1 \ldots \nu_h$ all integral values from zero upward, and the multiple integral (which is to be evaluated separately for each term of the multiple sum) is to be extended over all the (specific) phases of the system having the specified numbers of particles of the various kinds. The multiple integral in the last equation is what we have represented by $e^{-\frac{\psi}{\Theta}}$. See equation (92). We may therefore write

$$e^{-\frac{\Omega}{\Theta}} = \sum_{\nu_1}\ldots\sum_{\nu_h} \frac{e^{\frac{\mu_1\nu_1\ldots+\mu_h\nu_h-\psi}{\Theta}}}{\nu_1!\ldots\nu_h!}. \tag{506}$$

It should be observed that the summation includes a term in which all the symbols $\nu_1 \ldots \nu_h$ have the value zero. We must therefore recognize in a certain sense a system consisting of no particles, which, although a barren subject of study in itself, cannot well be excluded as a particular case of a system of a variable number of particles. In this case $\epsilon$ is constant, and there are no integrations to be performed. We have therefore*

$$e^{-\frac{\psi}{\Theta}} = e^{-\frac{\epsilon}{\Theta}}, \quad \text{i.e.,} \quad \psi = \epsilon.$$

* This conclusion may appear a little strained. The original definition of $\psi$ may not be regarded as fairly applying to systems of no degrees of freedom. We may therefore prefer to regard these equations as defining $\psi$ in this case.

The value of $\epsilon_p$ is of course zero in this case. But the value of $\epsilon_q$ contains an arbitrary constant, which is generally determined by considerations of convenience, so that $\epsilon_q$ and $\epsilon$ do not necessarily vanish with $\nu_1, \ldots \nu_h$.

Unless $-\Omega$ has a finite value, our formulae become illusory. We have already, in considering petit ensembles canonically distributed, found it necessary to exclude cases in which $-\psi$ has not a finite value. The same exclusion would here make $-\psi$ finite for any finite values of $\nu_1 \ldots \nu_h$. This does not necessarily make a multiple series of the form (506) finite.
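The determination of $\Omega$ by the series (506) may be sketched numerically for a single kind of particle. The truncation at `nu_max` and the form of `psi` in the usage note are assumptions made for illustration only and are not taken from the text; for the assumed form $\psi = -\Theta\,\nu \log z$ the series sums in closed form to $\exp(z\,e^{\mu/\Theta})$, so that $\Omega = -\Theta\, z\, e^{\mu/\Theta}$.

```python
from math import exp, lgamma, log


def omega_single_kind(mu, theta, psi, nu_max=200):
    """Omega from a truncated form of series (506) with one kind of particle.

    psi is a function of the particle number nu; psi(0) supplies the term
    for the system of no particles. The truncation nu_max is an assumption.
    """
    total = 0.0
    for nu in range(nu_max + 1):
        # exp((mu*nu - psi)/theta) / nu!, with log(nu!) taken from lgamma
        total += exp((mu * nu - psi(nu)) / theta - lgamma(nu + 1))
    return -theta * log(total)
```

With `theta = 1.0`, `mu = 0.0` and `psi = lambda nu: -nu * log(2.0)`, the function returns a value agreeing with the closed form $-2$ to within rounding.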
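We may observe, however, that if for all values of $\nu_1 \ldots \nu_h$

$$-\psi < c_0 + c_1\nu_1 \ldots + c_h\nu_h, \tag{507}$$

where $c_0, c_1, \ldots c_h$ are constants or functions of $\Theta$,

$$e^{-\frac{\Omega}{\Theta}} < \sum_{\nu_1}\ldots\sum_{\nu_h} \frac{e^{\frac{c_0+(\mu_1+c_1)\nu_1\ldots+(\mu_h+c_h)\nu_h}{\Theta}}}{\nu_1!\ldots\nu_h!}$$

[. . .]

$$\sum_{\nu_1}\ldots\sum_{\nu_h} \frac{e^{\frac{\Omega+\mu_1\nu_1\ldots+\mu_h\nu_h-\psi}{\Theta}}}{\nu_1!\ldots\nu_h!} = 1, \tag{518}$$

which agrees with (506).

The average value in the grand ensemble of any quantity $u$, is given by the formula

$$\bar{u} = \sum_{\nu_1}\ldots\sum_{\nu_h} \int\!\ldots\!\int_{\text{all phases}} \frac{u\, e^{\frac{\Omega+\mu_1\nu_1\ldots+\mu_h\nu_h-\epsilon}{\Theta}}}{\nu_1!\ldots\nu_h!}\,dp_1\ldots dq_n. \tag{519}$$

If $u$ is a function of $\nu_1, \ldots \nu_h$ alone, i.e., if it has the same value in all systems of any same petit ensemble, the formula reduces to

$$\bar{u} = \sum_{\nu_1}\ldots\sum_{\nu_h} \frac{u\, e^{\frac{\Omega+\mu_1\nu_1\ldots+\mu_h\nu_h-\psi}{\Theta}}}{\nu_1!\ldots\nu_h!}. \tag{520}$$

Again, if we write $\bar{u}|_{\text{grand}}$ and $\bar{u}|_{\text{petit}}$ to distinguish averages in the grand and petit ensembles, we shall have

$$\bar{u}|_{\text{grand}} = \sum_{\nu_1}\ldots\sum_{\nu_h} \bar{u}|_{\text{petit}}\, \frac{e^{\frac{\Omega+\mu_1\nu_1\ldots+\mu_h\nu_h-\psi}{\Theta}}}{\nu_1!\ldots\nu_h!}. \tag{521}$$

In this chapter, in which we are treating of grand ensembles, $\bar{u}$ will always denote the average for a grand ensemble. In the preceding chapters, $\bar{u}$ has always denoted the average for a petit ensemble.

Equation (505), which we repeat in a slightly different form, viz.,

$$e^{-\frac{\Omega}{\Theta}} = \sum_{\nu_1}\ldots\sum_{\nu_h} \int\!\ldots\!\int_{\text{all phases}} \frac{e^{\frac{\mu_1\nu_1\ldots+\mu_h\nu_h-\epsilon}{\Theta}}}{\nu_1!\ldots\nu_h!}\,dp_1\ldots dq_n, \tag{522}$$

shows that $\Omega$ is a function of $\Theta$ and $\mu_1, \ldots \mu_h$; also of the external coordinates $a_1$, $a_2$, etc., which are involved implicitly in $\epsilon$. If we differentiate the equation regarding all these quantities as variable, we have

$$\begin{aligned}
e^{-\frac{\Omega}{\Theta}}\left(-\frac{d\Omega}{\Theta} + \frac{\Omega\, d\Theta}{\Theta^2}\right)
&= -\frac{d\Theta}{\Theta^2} \sum_{\nu_1}\ldots\sum_{\nu_h} \int\!\ldots\!\int_{\text{all phases}} \frac{(\mu_1\nu_1\ldots+\mu_h\nu_h-\epsilon)\, e^{\frac{\mu_1\nu_1\ldots+\mu_h\nu_h-\epsilon}{\Theta}}}{\nu_1!\ldots\nu_h!}\,dp_1\ldots dq_n \\
&\quad + \frac{d\mu_1}{\Theta} \sum_{\nu_1}\ldots\sum_{\nu_h} \int\!\ldots\!\int_{\text{all phases}} \frac{\nu_1\, e^{\frac{\mu_1\nu_1\ldots+\mu_h\nu_h-\epsilon}{\Theta}}}{\nu_1!\ldots\nu_h!}\,dp_1\ldots dq_n + \text{etc.} \\
&\quad - \frac{da_1}{\Theta} \sum_{\nu_1}\ldots\sum_{\nu_h} \int\!\ldots\!\int_{\text{all phases}} \frac{\frac{d\epsilon}{da_1}\, e^{\frac{\mu_1\nu_1\ldots+\mu_h\nu_h-\epsilon}{\Theta}}}{\nu_1!\ldots\nu_h!}\,dp_1\ldots dq_n - \text{etc.}
\end{aligned} \tag{523}$$

If we multiply this equation by $e^{\frac{\Omega}{\Theta}}$, and set as usual $A_1$, $A_2$, etc., for $-d\epsilon/da_1$, $-d\epsilon/da_2$, etc., we get in virtue of the law expressed by equation (519),

$$-\frac{d\Omega}{\Theta} + \frac{\Omega\, d\Theta}{\Theta^2} = -\frac{d\Theta}{\Theta^2}\left(\mu_1\bar{\nu}_1\ldots+\mu_h\bar{\nu}_h-\bar{\epsilon}\right) + \frac{d\mu_1}{\Theta}\bar{\nu}_1 + \frac{d\mu_2}{\Theta}\bar{\nu}_2 + \text{etc.} + \frac{da_1}{\Theta}\bar{A}_1 + \frac{da_2}{\Theta}\bar{A}_2 + \text{etc.}; \tag{524}$$

that is,

$$d\Omega = \frac{\Omega+\mu_1\bar{\nu}_1\ldots+\mu_h\bar{\nu}_h-\bar{\epsilon}}{\Theta}\,d\Theta - \sum \bar{\nu}_1\, d\mu_1 - \sum \bar{A}_1\, da_1. \tag{525}$$

Since equation (503) gives

$$\frac{\Omega+\mu_1\bar{\nu}_1\ldots+\mu_h\bar{\nu}_h-\bar{\epsilon}}{\Theta} = \bar{\mathrm{H}}, \tag{526}$$

the preceding equation may be written

$$d\Omega = \bar{\mathrm{H}}\, d\Theta - \sum \bar{\nu}_1\, d\mu_1 - \sum \bar{A}_1\, da_1. \tag{527}$$

Again, equation (526) gives

$$d\Omega + \sum \mu_1\, d\bar{\nu}_1 + \sum \bar{\nu}_1\, d\mu_1 - d\bar{\epsilon} = \Theta\, d\bar{\mathrm{H}} + \bar{\mathrm{H}}\, d\Theta. \tag{528}$$

Eliminating $d\Omega$ from these equations, we get

$$d\bar{\epsilon} = -\Theta\, d\bar{\mathrm{H}} + \sum \mu_1\, d\bar{\nu}_1 - \sum \bar{A}_1\, da_1. \tag{529}$$

If we set

$$\Psi = \bar{\epsilon} + \Theta \bar{\mathrm{H}}, \tag{530}$$

$$d\Psi = d\bar{\epsilon} + \Theta\, d\bar{\mathrm{H}} + \bar{\mathrm{H}}\, d\Theta, \tag{531}$$

we have

$$d\Psi = \bar{\mathrm{H}}\, d\Theta + \sum \mu_1\, d\bar{\nu}_1 - \sum \bar{A}_1\, da_1. \tag{532}$$

The corresponding thermodynamic equations are

$$d\epsilon = T\, d\eta + \sum \mu_1\, dm_1 - \sum A_1\, da_1, \tag{533}$$

$$\psi = \epsilon - T\eta, \tag{534}$$

$$d\psi = -\eta\, dT + \sum \mu_1\, dm_1 - \sum A_1\, da_1. \tag{535}$$

These are derived from the thermodynamic equations (114) and (117) by the addition of the terms necessary to take account of variation in the quantities ($m_1$, $m_2$, etc.) of the several substances of which a body is composed. The correspondence of the equations is most perfect when the component substances are measured in such units that $m_1$, $m_2$, etc., are proportional to the numbers of the different kinds of molecules or atoms. The quantities $\mu_1$, $\mu_2$, etc., in these thermodynamic equations may be defined as differential coefficients by either of the equations in which they occur.*

* Compare Transactions Connecticut Academy, Vol. III, pages 116 ff.

If we compare the statistical equations (529) and (532) with (114) and (112), which are given in Chapter IV, and discussed in Chapter XIV, as analogues of thermodynamic equations, we find considerable difference. Beside the terms corresponding to the additional terms in the thermodynamic equations of this chapter, and beside the fact that the averages are taken in a grand ensemble in one case and in a petit in the other, the analogues of entropy, $\bar{\mathrm{H}}$ and $\bar{\eta}$, are quite different in definition and value. We shall return to this point after we have determined the order of magnitude of the usual anomalies of $\nu_1, \ldots \nu_h$.

If we differentiate equation (518) with respect to $\mu_1$, and multiply by $\Theta$, we get

$$\sum_{\nu_1}\ldots\sum_{\nu_h} \left(\frac{d\Omega}{d\mu_1} + \nu_1\right) \frac{e^{\frac{\Omega+\mu_1\nu_1\ldots+\mu_h\nu_h-\psi}{\Theta}}}{\nu_1!\ldots\nu_h!} = 0, \tag{536}$$

whence $d\Omega/d\mu_1 = -\bar{\nu}_1$, which agrees with (527). Differentiating again with respect to $\mu_1$, and to $\mu_2$, and setting

$$\frac{d\Omega}{d\mu_1} = -\bar{\nu}_1, \qquad \frac{d\Omega}{d\mu_2} = -\bar{\nu}_2,$$

we get

$$\sum_{\nu_1}\ldots\sum_{\nu_h} \left(\frac{d^2\Omega}{d\mu_1^2} + \frac{(\nu_1-\bar{\nu}_1)^2}{\Theta}\right) \frac{e^{\frac{\Omega+\mu_1\nu_1\ldots+\mu_h\nu_h-\psi}{\Theta}}}{\nu_1!\ldots\nu_h!} = 0, \tag{537}$$

$$\sum_{\nu_1}\ldots\sum_{\nu_h} \left(\frac{d^2\Omega}{d\mu_1\, d\mu_2} + \frac{(\nu_1-\bar{\nu}_1)(\nu_2-\bar{\nu}_2)}{\Theta}\right) \frac{e^{\frac{\Omega+\mu_1\nu_1\ldots+\mu_h\nu_h-\psi}{\Theta}}}{\nu_1!\ldots\nu_h!} = 0. \tag{538}$$

The first members of these equations represent the average values of the quantities in the principal parentheses. We have therefore

$$\overline{(\nu_1-\bar{\nu}_1)^2} = -\Theta \frac{d^2\Omega}{d\mu_1^2} = \Theta \frac{d\bar{\nu}_1}{d\mu_1}, \tag{539}$$

$$\overline{(\nu_1-\bar{\nu}_1)(\nu_2-\bar{\nu}_2)} = -\Theta \frac{d^2\Omega}{d\mu_1\, d\mu_2} = \Theta \frac{d\bar{\nu}_1}{d\mu_2} = \Theta \frac{d\bar{\nu}_2}{d\mu_1}. \tag{540}$$

From equation (539) we may get an idea of the order of magnitude of the divergences of $\nu_1$ from its average value in the ensemble, when that average value is great. The equation may be written

$$\frac{\overline{(\nu_1-\bar{\nu}_1)^2}}{\bar{\nu}_1^2} = \frac{\Theta}{\bar{\nu}_1^2} \frac{d\bar{\nu}_1}{d\mu_1}. \tag{541}$$

The second member of this equation will in general be small when $\bar{\nu}_1$ is great. Large values are not necessarily excluded, but they must be confined within very small limits with respect to $\mu_1$. For if

$$\frac{\overline{(\nu_1-\bar{\nu}_1)^2}}{\bar{\nu}_1^2} > \bar{\nu}_1^{-\frac{1}{2}} \tag{542}$$

for all values of $\mu_1$ between the limits $\mu_1'$ and $\mu_1''$, we shall have between the same limits

$$\frac{\Theta}{\bar{\nu}_1^{\frac{3}{2}}}\, d\bar{\nu}_1 > d\mu_1, \tag{543}$$

and therefore

$$2\Theta \left(\frac{1}{\bar{\nu}_1'^{\frac{1}{2}}} - \frac{1}{\bar{\nu}_1''^{\frac{1}{2}}}\right) > \mu_1'' - \mu_1'. \tag{544}$$

The difference $\mu_1'' - \mu_1'$ is therefore numerically a very small quantity. To form an idea of the importance of such a difference, we should observe that in formula (498) $\mu_1$ is multiplied by $\nu_1$ and the product subtracted from the energy. A very small difference in the value of $\mu_1$ may therefore be important. But since $\nu_1\Theta$ is always less than the kinetic energy of the system, our formula shows that $\mu_1'' - \mu_1'$, even when multiplied by $\bar{\nu}_1'$ or $\bar{\nu}_1''$, may still be regarded as an insensible quantity.

We can now perceive the leading characteristics with respect to properties sensible to human faculties of such an ensemble as we are considering (a grand ensemble canonically distributed), when the average numbers of particles of the various kinds are of the same order of magnitude as the number of molecules in the bodies which are the subject of physical experiment.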
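The relation which follows from (537), that the mean square of the divergence of $\nu_1$ from its average equals $\Theta\, d\bar{\nu}_1/d\mu_1$, may be checked numerically in a simple case. The sketch below is only an illustration: it assumes a single kind of particle, a truncated sum, and a hypothetical $\psi$ linear in $\nu$ (for which the distribution of $\nu$ is of Poisson form); none of these choices comes from the text.

```python
from math import exp, lgamma, log


def petit_probabilities(mu, theta, psi, nu_max=400):
    """Probabilities of the petit ensembles nu = 0..nu_max, normalized numerically."""
    weights = [exp((mu * nu - psi(nu)) / theta - lgamma(nu + 1))
               for nu in range(nu_max + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def mean_and_variance(probs):
    """Average of nu and mean square of its divergence from that average."""
    mean = sum(nu * p for nu, p in enumerate(probs))
    variance = sum((nu - mean) ** 2 * p for nu, p in enumerate(probs))
    return mean, variance


def theta_dmean_dmu(mu, theta, psi, step=1e-5):
    """Theta * d(mean nu)/d(mu), estimated by a central difference."""
    upper, _ = mean_and_variance(petit_probabilities(mu + step, theta, psi))
    lower, _ = mean_and_variance(petit_probabilities(mu - step, theta, psi))
    return theta * (upper - lower) / (2.0 * step)
```

With `theta = 1.0`, `mu = 0.0` and `psi = lambda nu: -nu * log(50.0)`, the average number is about 50, and the mean square divergence and the central-difference estimate of $\Theta\, d\bar{\nu}/d\mu$ agree with each other closely, so that the relative divergence is of the order $\bar{\nu}^{-1/2}$.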
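Although the ensemble contains systems having the widest possible variations in respect to the numbers of the particles which they contain, these variations are practically contained within such narrow limits as to be insensible, except for particular values of the constants of the ensemble. This exception corresponds precisely to the case of nature, when certain thermodynamic quantities corresponding to $\Theta$, $\mu_1$, $\mu_2$, etc., which in general determine the separate densities of various components of a body, have certain values which make these densities indeterminate, in other words, when the conditions are such as determine coexistent phases of matter. Except in the case of these particular values, the grand ensemble would not differ to human faculties of perception from a petit ensemble, viz., any one of the petit ensembles which it contains in which $\nu_1$, $\nu_2$, etc., do not sensibly differ from their average values.

Let us now compare the quantities $\mathrm{H}$ and $\eta$, the average values of which (in a grand and a petit ensemble respectively) we have seen to correspond to entropy. Since

$$\mathrm{H} = \frac{\Omega+\mu_1\nu_1\ldots+\mu_h\nu_h-\epsilon}{\Theta},$$

and

$$\eta = \frac{\psi-\epsilon}{\Theta},$$

$$\mathrm{H} - \eta = \frac{\Omega+\mu_1\nu_1\ldots+\mu_h\nu_h-\psi}{\Theta}. \tag{545}$$

A part of this difference is due to the fact that $\mathrm{H}$ relates to generic phases and $\eta$ to specific. If we write $\eta_{gen}$ for the index of probability for generic phases in a petit ensemble, we have

$$\eta_{gen} = \eta + \log\left(\nu_1!\ldots\nu_h!\right), \tag{546}$$

$$\mathrm{H} - \eta = \mathrm{H} - \eta_{gen} + \log\left(\nu_1!\ldots\nu_h!\right), \tag{547}$$

$$\mathrm{H} - \eta_{gen} = \frac{\Omega+\mu_1\nu_1\ldots+\mu_h\nu_h-\psi}{\Theta} - \log\left(\nu_1!\ldots\nu_h!\right). \tag{548}$$

This is the logarithm of the probability of the petit ensemble ($\nu_1 \ldots \nu_h$).* If we set

$$\frac{\psi_{gen}-\epsilon}{\Theta} = \eta_{gen}, \tag{549}$$

which corresponds to the equation

$$\frac{\psi-\epsilon}{\Theta} = \eta, \tag{550}$$

we have

$$\psi_{gen} = \psi + \Theta \log\left(\nu_1!\ldots\nu_h!\right),$$

and

$$\mathrm{H} - \eta_{gen} = \frac{\Omega+\mu_1\nu_1\ldots+\mu_h\nu_h-\psi_{gen}}{\Theta}. \tag{551}$$

* See formula (517).

This will have a maximum when†

$$\frac{d\psi_{gen}}{d\nu_1} = \mu_1, \qquad \frac{d\psi_{gen}}{d\nu_2} = \mu_2, \qquad \text{etc.} \tag{552}$$

† Strictly speaking, $\psi_{gen}$ is not determined as function of $\nu_1, \ldots \nu_h$, except for integral values of these variables. Yet we may suppose it to be determined as a continuous function by any suitable process of interpolation.

Distinguishing values corresponding to this maximum by accents, we have approximately, when $\nu_1, \ldots \nu_h$ are of the same order of magnitude as the numbers of molecules in ordinary bodies,

$$\begin{aligned}
\mathrm{H} - \eta_{gen} &= \frac{\Omega+\mu_1\nu_1\ldots+\mu_h\nu_h-\psi_{gen}}{\Theta} \\
&= \frac{\Omega+\mu_1\nu_1'\ldots+\mu_h\nu_h'-\psi_{gen}'}{\Theta} - \left(\frac{d^2\psi_{gen}}{d\nu_1^2}\right)' \frac{(\Delta\nu_1)^2}{2\Theta} - \left(\frac{d^2\psi_{gen}}{d\nu_1\, d\nu_2}\right)' \frac{\Delta\nu_1\, \Delta\nu_2}{\Theta} \ldots - \left(\frac{d^2\psi_{gen}}{d\nu_h^2}\right)' \frac{(\Delta\nu_h)^2}{2\Theta},
\end{aligned} \tag{553}$$

$$e^{\mathrm{H}-\eta_{gen}} = e^{C}\, e^{-\left(\frac{d^2\psi_{gen}}{d\nu_1^2}\right)' \frac{(\Delta\nu_1)^2}{2\Theta} - \left(\frac{d^2\psi_{gen}}{d\nu_1\, d\nu_2}\right)' \frac{\Delta\nu_1\, \Delta\nu_2}{\Theta} \ldots - \left(\frac{d^2\psi_{gen}}{d\nu_h^2}\right)' \frac{(\Delta\nu_h)^2}{2\Theta}}, \tag{554}$$

where

$$C = \frac{\Omega+\mu_1\nu_1'\ldots+\mu_h\nu_h'-\psi_{gen}'}{\Theta}, \tag{555}$$

and

$$\Delta\nu_1 = \nu_1 - \nu_1', \qquad \Delta\nu_2 = \nu_2 - \nu_2', \qquad \text{etc.} \tag{556}$$

This is the probability of the system ($\nu_1 \ldots \nu_h$). The probability that the values of $\nu_1, \ldots \nu_h$ lie within given limits is given by the multiple integral

$$\int\!\ldots\!\int e^{C}\, e^{-\left(\frac{d^2\psi_{gen}}{d\nu_1^2}\right)' \frac{(\Delta\nu_1)^2}{2\Theta} - \left(\frac{d^2\psi_{gen}}{d\nu_1\, d\nu_2}\right)' \frac{\Delta\nu_1\, \Delta\nu_2}{\Theta} \ldots - \left(\frac{d^2\psi_{gen}}{d\nu_h^2}\right)' \frac{(\Delta\nu_h)^2}{2\Theta}}\, d\nu_1 \ldots d\nu_h. \tag{557}$$

This shows that the distribution of the grand ensemble with respect to the values of $\nu_1, \ldots \nu_h$ follows the "law of errors" when $\nu_1', \ldots \nu_h'$ are very great. The value of this integral for the limits $\pm\infty$ should be unity. This gives

$$e^{C}\, \frac{(2\pi\Theta)^{\frac{h}{2}}}{D^{\frac{1}{2}}} = 1, \tag{558}$$

or

$$C = \tfrac{1}{2}\log D - \tfrac{h}{2}\log(2\pi\Theta), \tag{559}$$

where

$$D = \begin{vmatrix}
\left(\dfrac{d^2\psi_{gen}}{d\nu_1^2}\right)' & \left(\dfrac{d^2\psi_{gen}}{d\nu_1\, d\nu_2}\right)' & \cdots & \left(\dfrac{d^2\psi_{gen}}{d\nu_1\, d\nu_h}\right)' \\
\cdots & \cdots & \cdots & \cdots \\
\left(\dfrac{d^2\psi_{gen}}{d\nu_h\, d\nu_1}\right)' & \left(\dfrac{d^2\psi_{gen}}{d\nu_h\, d\nu_2}\right)' & \cdots & \left(\dfrac{d^2\psi_{gen}}{d\nu_h^2}\right)'
\end{vmatrix}, \tag{560}$$

that is,

$$D = \begin{vmatrix}
\left(\dfrac{d\mu_1}{d\nu_1}\right)' & \left(\dfrac{d\mu_1}{d\nu_2}\right)' & \cdots & \left(\dfrac{d\mu_1}{d\nu_h}\right)' \\
\cdots & \cdots & \cdots & \cdots \\
\left(\dfrac{d\mu_h}{d\nu_1}\right)' & \left(\dfrac{d\mu_h}{d\nu_2}\right)' & \cdots & \left(\dfrac{d\mu_h}{d\nu_h}\right)'
\end{vmatrix}. \tag{561}$$

Now, by (553), we have for the first approximation

$$\bar{\mathrm{H}} - \bar{\eta}_{gen} = C = \tfrac{1}{2}\log D - \tfrac{h}{2}\log(2\pi\Theta), \tag{562}$$

and if we divide by the constant $K$‡ to reduce these quantities to the usual unit of entropy,

$$\frac{\bar{\mathrm{H}} - \bar{\eta}_{gen}}{K} = \frac{\log D - h\log(2\pi\Theta)}{2K}. \tag{563}$$

‡ See pages 184-186.

This is evidently a negligible quantity, since $K$ is of the same order of magnitude as the number of molecules in ordinary bodies. It is to be observed that $\bar{\eta}_{gen}$ is here the average in the grand ensemble, whereas the quantity which we wish to compare with $\bar{\mathrm{H}}$ is the average in a petit ensemble.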
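The smallness of the first approximation $C = \tfrac{1}{2}\log D - \tfrac{h}{2}\log(2\pi\Theta)$ may be seen in a hedged numerical sketch for a single kind of particle ($h = 1$). The model is an assumption made only for illustration: $\psi$ is taken linear in $\nu$, so that the second derivative of $\psi_{gen}$ at the maximum is approximately $\Theta/\nu'$ and the distribution of $\nu$ is of Poisson form. The function names are hypothetical.

```python
from math import lgamma, log, pi


def first_approximation_c(determinant_d, theta, kinds):
    """C = (1/2) log D - (h/2) log(2 pi Theta), with h the number of kinds."""
    return 0.5 * log(determinant_d) - 0.5 * kinds * log(2.0 * pi * theta)


def log_probability_at_maximum(mean_number):
    """Logarithm of the probability of the most probable particle number in
    the assumed Poisson-form model: log(e^{-m} m^m / m!)."""
    m = float(mean_number)
    return m * log(m) - m - lgamma(m + 1.0)
```

For `theta = 2.5` and a most probable number of `1e6`, `first_approximation_c(theta / 1e6, theta, 1)` is about $-7.8$ and agrees closely with `log_probability_at_maximum(1e6)`; divided by a number of the order of the number of molecules, it is negligible.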
But as we have seen that in the case considered the grand ensemble would appear to human observation as a petit ensemble, this distinction may be neglected. The differences therefore, in the case considered, between the quantities which may be represented by the notations*

$$\bar{\mathrm{H}}_{gen}|_{\text{grand}}, \qquad \bar{\eta}_{gen}|_{\text{grand}}, \qquad \bar{\eta}_{gen}|_{\text{petit}}$$

are not sensible to human faculties. The difference

$$\bar{\eta}_{gen}|_{\text{petit}} - \bar{\eta}_{spec}|_{\text{petit}} = \log\left(\nu_1!\ldots\nu_h!\right),$$

and is therefore constant, so long as the numbers $\nu_1, \ldots \nu_h$ are constant. For constant values of these numbers, therefore, it is immaterial whether we use the average of $\eta_{gen}$ or of $\eta$ for entropy, since this only affects the arbitrary constant of integration which is added to entropy. But when the numbers $\nu_1, \ldots \nu_h$ are varied, it is no longer possible to use the index for specific phases. For the principle that the entropy of any body has an arbitrary additive constant is subject to limitation, when different quantities of the same substance are concerned. In this case, the constant being determined for one quantity of a substance, is thereby determined for all quantities of the same substance.

* In this paragraph, for greater distinctness, $\bar{\mathrm{H}}_{gen}|_{\text{grand}}$ and $\bar{\eta}_{spec}|_{\text{petit}}$ have been written for the quantities which elsewhere are denoted by $\bar{\mathrm{H}}$ and $\bar{\eta}$.

To fix our ideas, let us suppose that we have two identical fluid masses in contiguous chambers. The entropy of the whole is equal to the sum of the entropies of the parts, and double that of one part. Suppose a valve is now opened, making a communication between the chambers. We do not regard this as making any change in the entropy, although the masses of gas or liquid diffuse into one another, and although the same process of diffusion would increase the entropy, if the masses of fluid were different.
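The two-chamber illustration can be put in numbers by a rough sketch. The model is an assumption and not part of the text: only the configurational part of the phase-extension of $\nu$ similar particles in a volume $V$ is kept, taken as $V^{\nu}$, so that a specific-phase count gives $\nu \log V$ and a generic-phase count gives $\nu \log V - \log \nu!$. The function names are hypothetical.

```python
from math import lgamma, log, pi


def specific_entropy(nu, volume):
    """Toy entropy from specific phases: log of V**nu (assumed model)."""
    return nu * log(volume)


def generic_entropy(nu, volume):
    """Toy entropy from generic phases: log of V**nu / nu! (assumed model)."""
    return nu * log(volume) - lgamma(nu + 1)


def change_on_opening_valve(entropy, nu, volume):
    """Entropy of the united mass minus the sum of the entropies of the two
    identical masses before the valve is opened."""
    return entropy(2 * nu, 2 * volume) - 2 * entropy(nu, volume)
```

With `nu = 10**6` and unit volume, the specific count gives a change of `2 * nu * log(2)`, proportional to the quantity of matter, while the generic count gives only about `0.5 * log(pi * nu)`, which is insensible for bodies of ordinary size.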
It is evident, therefore, that it is equilibrium with respect to generic phases, and not with respect to specific, with which we have to do in the evaluation of entropy, and therefore, that we must use the average of $\mathrm{H}$ or of $\eta_{gen}$, and not that of $\eta$, as the equivalent of entropy, except in the thermodynamics of bodies in which the number of molecules of the various kinds is constant.

PART TWO

DYNAMICS
VECTOR ANALYSIS AND MULTIPLE ALGEBRA
ELECTROMAGNETIC THEORY OF LIGHT
ETC.

PREFATORY NOTE TO PART TWO

In a few cases slight corrections had been made by the author in his own copies of the papers. These changes, together with the correction of obvious misprints in the originals, have been incorporated in the present edition without comment. Where for the sake of clearness it has seemed desirable to insert a word or two in a footnote or in the text itself, the addition has been indicated by enclosing it within square brackets [ ], a sign which is otherwise used only in the formulae.

CONTENTS OF PART TWO

DYNAMICS.

I. On the Fundamental Formulae of Dynamics. [Amer. Jour. Math., vol. ii, pp. 49-64, 1879.]
II. On the Fundamental Formula of Statistical Mechanics with Applications to Astronomy and Thermodynamics. (Abstract). [Proc. Amer. Assoc., vol. xxxiii, pp. 57, 58, 1884.]

VECTOR ANALYSIS AND MULTIPLE ALGEBRA.

III. Elements of Vector Analysis, Arranged for the Use of Students in Physics. [Not published. Printed, New Haven, pp. 1-36, 1881; pp. 37-83, 1884.]
IV. On Multiple Algebra. Vice-President's Address before the American Association for the Advancement of Science. [Proc. Amer. Assoc., vol. xxxv, pp. 37-66, 1886.]
V. On the Determination of Elliptic Orbits from Three Complete Observations. [Mem. Nat. Acad. Sci., vol. iv, part 2, pp. 79-104, 1889.]
VI. On the Use of the Vector Method in the Determination of Orbits. Letter to the Editor of Klinkerfues' "Theoretische Astronomie." [Hitherto unpublished.]
VII. On the Rôle of Quaternions in the Algebra of Vectors. [Nature, vol. xliii, pp. 511-513, 1891.]
VIII. Quaternions and the "Ausdehnungslehre." [Nature, vol. xliv, pp. 79-82, 1891.]
IX. Quaternions and the Algebra of Vectors. [Nature, vol. xlvii, pp. 463, 464, 1893.]
X. Quaternions and Vector Analysis. [Nature, vol. xlviii, pp. 364-367, 1893.]

THE ELECTROMAGNETIC THEORY OF LIGHT.

XI. On Double Refraction and the Dispersion of Colors in Perfectly Transparent Media. [Amer. Jour. Sci., ser. 3, vol. xxiii, pp. 262-275, 1882.]
XII. On Double Refraction in Perfectly Transparent Media which Exhibit the Phenomena of Circular Polarization. [Amer. Jour. Sci., ser. 3, vol. xxiii, pp. 460-476, 1882.]
XIII. On the General Equations of Monochromatic Light in Media of Every Degree of Transparency. [Amer. Jour. Sci., ser. 3, vol. xxv, pp. 107-118, 1883.]
XIV. A Comparison of the Elastic and the Electrical Theories of Light with Respect to the Law of Double Refraction and the Dispersion of Colors. [Amer. Jour. Sci., ser. 3, vol. xxxv, pp. 467-475, 1888.]
XV. A Comparison of the Electric Theory of Light and Sir William Thomson's Theory of a Quasi-labile Ether. [Amer. Jour. Sci., ser. 3, vol. xxxvii, pp. 129-144, 1889.]

MISCELLANEOUS PAPERS.

XVI. Reviews of Newcomb and Michelson's "Velocity of Light in Air and Refracting Media" and of Ketteler's "Theoretische Optik." [Amer. Jour. Sci., ser. 3, vol. xxxi, pp. 62-67, 1886.]
XVII. On the Velocity of Light as Determined by Foucault's Revolving Mirror. [Nature, vol. xxxiii, p. 582, 1886.]
XVIII. Velocity of Propagation of Electrostatic Force. [Nature, vol. liii, p. 509, 1896.]
XIX. Fourier's Series. [Nature, vol. lix, pp. 200 and 606, 1898-99.]
XX. Rudolf Julius Emanuel Clausius. [Proc. Amer. Acad., new series, vol. xvi, pp. 458-465, 1889.]
XXI. Hubert Anson Newton. [Amer. Jour. Sci., ser. 4, vol. iii, pp. 359-376, 1897.]

I. ON THE FUNDAMENTAL FORMULAE OF DYNAMICS.

[American Journal of Mathematics, vol. ii, pp. 49-64, 1879.]

Formation of a new Indeterminate Formula of Motion by the Substitution of the Variations of the Components of Acceleration for the Variations of the Coordinates in the usual Formula.

The laws of motion are frequently expressed by an equation of the form

$$\sum\left[(X - m\ddot{x})\,\delta x + (Y - m\ddot{y})\,\delta y + (Z - m\ddot{z})\,\delta z\right] = 0, \tag{1}$$

in which $m$ denotes the mass of a particle of the system considered, $x$, $y$, $z$ its rectangular coordinates, $\ddot{x}$, $\ddot{y}$, $\ddot{z}$ the second differential coefficients of the coordinates with respect to the time, $X$, $Y$, $Z$ the components of the forces acting on the particle, $\delta x$, $\delta y$, $\delta z$ any arbitrary variations of the coordinates which are simultaneously possible, and $\sum$ a summation with respect to all the particles of the system. It is evident that we may substitute for $\delta x$, $\delta y$, $\delta z$ any other expressions which are capable of the same and only of the same sets of simultaneous values.

Now if the nature of the system is such that certain functions $A$, $B$, etc. of the coordinates must be constant, or given functions of the time, we have

$$\sum\left(\frac{dA}{dx}\,\delta x + \frac{dA}{dy}\,\delta y + \frac{dA}{dz}\,\delta z\right) = 0, \qquad \sum\left(\frac{dB}{dx}\,\delta x + \frac{dB}{dy}\,\delta y + \frac{dB}{dz}\,\delta z\right) = 0, \tag{2}$$

etc. These are the equations of condition, to which the variations in the general equation of motion (1) are subject. But if $A$ is constant or a determined function of the time, the same must be true of $\dot{A}$ and $\ddot{A}$. Now

$$\dot{A} = \sum\left(\frac{dA}{dx}\,\dot{x} + \frac{dA}{dy}\,\dot{y} + \frac{dA}{dz}\,\dot{z}\right)$$

and

$$\ddot{A} = \sum\left(\frac{dA}{dx}\,\ddot{x} + \frac{dA}{dy}\,\ddot{y} + \frac{dA}{dz}\,\ddot{z}\right) + H,$$

where $H$ represents terms containing only the second differential coefficients of $A$ with respect to the coordinates, and the first differential coefficients of the coordinates with respect to the time.
Therefore, if we conceive of a variation affecting the accelerations of the particles at the time considered, but not their positions or velocities, we have

$$\delta\ddot{A} = \sum\left(\frac{dA}{dx}\,\delta\ddot{x} + \frac{dA}{dy}\,\delta\ddot{y} + \frac{dA}{dz}\,\delta\ddot{z}\right) = 0,$$

and, in like manner,

$$\delta\ddot{B} = \sum\left(\frac{dB}{dx}\,\delta\ddot{x} + \frac{dB}{dy}\,\delta\ddot{y} + \frac{dB}{dz}\,\delta\ddot{z}\right) = 0, \tag{3}$$

etc. Comparing these equations with (2), we see that when the accelerations of the particles are regarded as subject to the variation denoted by $\delta$, but not their positions or velocities, the possible values of $\delta\ddot{x}$, $\delta\ddot{y}$, $\delta\ddot{z}$ are subject to precisely the same restrictions as the values of $\delta x$, $\delta y$, $\delta z$, when the positions of the particles are regarded as variable. We may, therefore, write for the general equation of motion

$$\sum\left[(X - m\ddot{x})\,\delta\ddot{x} + (Y - m\ddot{y})\,\delta\ddot{y} + (Z - m\ddot{z})\,\delta\ddot{z}\right] = 0, \tag{4}$$

regarding the positions and velocities of the particles as unaffected by the variation denoted by $\delta$, a condition which may be expressed by the equations

$$\begin{aligned}
\delta x &= 0, & \delta y &= 0, & \delta z &= 0, \\
\delta\dot{x} &= 0, & \delta\dot{y} &= 0, & \delta\dot{z} &= 0.
\end{aligned} \tag{5}$$

We have so far supposed that the conditions which restrict the possible motions of the systems may be expressed by equations between the coordinates alone or the coordinates and the time. To extend the formula of motion to cases in which the conditions are expressed by the characters $\le$ or $\ge$, we may write

$$\sum\left[(X - m\ddot{x})\,\delta\ddot{x} + (Y - m\ddot{y})\,\delta\ddot{y} + (Z - m\ddot{z})\,\delta\ddot{z}\right] \le 0. \tag{6}$$

The conditions which determine the possible values of $\delta\ddot{x}$, $\delta\ddot{y}$, $\delta\ddot{z}$ will not, in such cases, be entirely similar to those which determine the possible values of $\delta x$, $\delta y$, $\delta z$, when the coordinates are regarded as variable. Nevertheless, the laws of motion are correctly expressed by the formula (6), while the formula

$$\sum\left[(X - m\ddot{x})\,\delta x + (Y - m\ddot{y})\,\delta y + (Z - m\ddot{z})\,\delta z\right] \le 0, \tag{7}$$

does not, as naturally interpreted, give so complete and accurate an expression of the laws of motion.

This may be illustrated by a simple example. Let it be required to find the acceleration of a material point, which, at a given instant, is moving with given velocity on the frictionless surface of a body (which it cannot penetrate, but which it may leave), and is acted on by given forces. For simplicity, we may suppose that the normal to the surface, drawn outward from the moving point at the moment considered, is parallel to the axis of $X$ and in the positive direction. The only restriction on the values of $\delta x$, $\delta y$, $\delta z$ is that

$$\delta x \ge 0.$$

Formula (7) will therefore give

$$\ddot{x} \ge \frac{X}{m}, \qquad \ddot{y} = \frac{Y}{m}, \qquad \ddot{z} = \frac{Z}{m}.$$

The condition that the point shall not penetrate the body gives another condition for the value of $\ddot{x}$. If the point remains upon the surface, $\ddot{x}$ must have a certain value $N$, determined by the form of the surface and the velocity of the point. If the value of $\ddot{x}$ is less than this, the point must penetrate the body. Therefore,

$$\ddot{x} \ge N.$$

But this does not suffice to determine the acceleration of the point.

Let us now apply formula (6) to the same problem. Since $\ddot{x}$ cannot be less than $N$,

$$\text{if } \ddot{x} = N, \quad \delta\ddot{x} \ge 0.$$

This is the only restriction on the value of $\delta\ddot{x}$, for if $\ddot{x} > N$, the value of $\delta\ddot{x}$ is entirely arbitrary. Formula (6), therefore, requires that

$$\text{if } \ddot{x} = N, \quad \ddot{x} \ge \frac{X}{m}; \qquad \text{but if } \ddot{x} > N, \quad \ddot{x} = \frac{X}{m};$$

that is (since $\ddot{x}$ cannot be less than $N$), that $\ddot{x}$ shall be equal to the greater of the quantities $N$ and $\frac{X}{m}$, or to both, if they are equal; and that

$$\ddot{y} = \frac{Y}{m}, \qquad \ddot{z} = \frac{Z}{m}.$$

The values of $\ddot{x}$, $\ddot{y}$, $\ddot{z}$ are therefore entirely determined by this formula in connection with the conditions afforded by the constraints of the system.*

The following considerations will show that what is true in this case is also true in general, when the conditions to which the system [. . .]

* The failure of the formula (7) in this case is rather apparent than real; for, although the formula apparently allows to $\ddot{x}$, at the instant considered, a value exceeding both $N$ and $\frac{X}{m}$, it does not allow this for any interval, however short. For if [. . .]

[. . .] That is, if $\dot{x} = 0$, $\ddot{x}$ has the greater of the values $\frac{X}{m}$ and $0$; otherwise,

$$\ddot{x} = \frac{X}{m}.$$

In cases of this kind also, in which the function which cannot exceed a certain value involves the velocities (with or without the coordinates), one may easily convince himself that formula (6) is always valid, and always sufficient to determine the accelerations with the aid of the conditions afforded by the constraints of the system. But instead of examining such cases in detail, we shall proceed to consider the subject from a more general point of view.

Comparison of the New Formula with the Statical Principle of Virtual Velocities. Case of Discontinuous Changes of Velocity.

Formula (1) has so far served as a point of departure. The general validity of this, the received form of the indeterminate equation of motion, being assumed, it has been shown that formula (6) will be valid and sufficient, even in cases in which both (1) and (7) fail. We now proceed to show that the statical principle of virtual velocities, when its real signification is carefully considered, leads directly to formula (6), or to an analogous formula for the determination of the discontinuous changes of velocity, when such occur. This will be the case even if we start with the usual analytical expression of the principle

$$\sum\left(X\,\delta x + Y\,\delta y + Z\,\delta z\right) \le 0, \tag{8}$$

to which, at first sight, formula (6) appears less closely related than (7). For the variations of the coordinates in this formula must be regarded as relating to differences between the configuration which the system has at a certain time, and which it will continue to have in case of equilibrium, and some other configuration which the system might be supposed to have at some subsequent time. These temporal relations are not indicated explicitly in the notation, and should not be, since the statical problem does not involve the time in any quantitative manner. But in a dynamical problem, in which we take account of the time, it is hardly natural to use $\delta x$, $\delta y$, $\delta z$ in the same sense.
In any problem in which $x$, $y$, $z$ are regarded as functions of the time, $\delta x$, $\delta y$, $\delta z$ are naturally understood to relate to differences between the configuration which the system has at a certain time, and some other configuration which it might (conceivably) have had at that time instead of that which it actually had.

Now when we suppose a point to have a certain position, specified by $x$, $y$, $z$, at a certain time, its position at that time is no longer a subject of hypothesis or of question. It is its future positions which form the subject of inquiry. Its position in the immediate future is naturally specified by

$$x + \dot{x}\,dt + \tfrac{1}{2}\ddot{x}\,dt^2 + \text{etc.}, \qquad y + \dot{y}\,dt + \tfrac{1}{2}\ddot{y}\,dt^2 + \text{etc.}, \qquad z + \dot{z}\,dt + \tfrac{1}{2}\ddot{z}\,dt^2 + \text{etc.},$$

and we may regard the variations of these expressions as corresponding to the $\delta x$, $\delta y$, $\delta z$ of the statical problem. It is evidently sufficient to take account of the first term of these expressions of which the variation is not zero. Now, $x$, $y$, $z$, as has already been said, are to be regarded as constant. With respect to the terms containing $\dot{x}$, $\dot{y}$, $\dot{z}$, two cases are to be distinguished, according as there is, or is not, a finite change of velocity at the instant considered.

Let us first consider the most important case, in which there is no discontinuous change of velocity. In this case, $\dot{x}$, $\dot{y}$, $\dot{z}$ are not to be regarded as variable (by $\delta$), and the variations of the above expressions are represented by

$$\tfrac{1}{2}\delta\ddot{x}\,dt^2, \qquad \tfrac{1}{2}\delta\ddot{y}\,dt^2, \qquad \tfrac{1}{2}\delta\ddot{z}\,dt^2,$$

which are, therefore, to be substituted for $\delta x$, $\delta y$, $\delta z$ in the general formula of equilibrium (8) to adapt it to the conditions of a dynamical problem. By this substitution (in which the common factor $\tfrac{1}{2}dt^2$ may of course be omitted), and the addition of the terms expressing the reaction against acceleration, we obtain formula (6).
But if the circumstances are such that there is (or may be) a discontinuity in the values of $\dot{x}$, $\dot{y}$, $\dot{z}$ at the instant considered, it is necessary to distinguish the values of these expressions before and after the abrupt change. For this purpose, we may apply $\dot{x}$, $\dot{y}$, $\dot{z}$ to the original values, and denote the changed values by $\dot{x}+\Delta\dot{x}$, $\dot{y}+\Delta\dot{y}$, $\dot{z}+\Delta\dot{z}$. The value of $x$ at a time very shortly subsequent to the instant considered, will be expressed by $x + (\dot{x}+\Delta\dot{x})\,dt + \text{etc.}$, in which we may regard $\Delta\dot{x}$ as subject to the variation denoted by $\delta$. The variation of the expression is therefore $\delta\Delta\dot{x}\,dt$. Instead of $-m\ddot{x}$, which expresses the reaction against acceleration, we need in the present case $-m\Delta\dot{x}$ to express the reaction against the abrupt change of velocity. A reaction against such a change of velocity is, of course, to be regarded as infinite in intensity in comparison with reactions due to acceleration, and ordinary forces (such as cause acceleration) may be neglected in comparison. If, however, we conceive of the system as acted on by impulsive forces (i.e., such as have no finite duration, but are capable of producing finite changes of velocity, and are measured numerically by the discontinuities of velocity which they produce in the unit of mass), these forces should be combined with the reactions due to the discontinuities of velocity in the general formula which determines these discontinuities. If the impulsive forces are specified by $X$, $Y$, $Z$, the formula will be

$$\sum\left[(X - m\Delta\dot{x})\,\delta\Delta\dot{x} + (Y - m\Delta\dot{y})\,\delta\Delta\dot{y} + (Z - m\Delta\dot{z})\,\delta\Delta\dot{z}\right] \le 0. \tag{9}$$

The reader will remark the strict analogy between this formula and (6), which would perhaps be more clearly exhibited if we should write [. . .] for $\ddot{x}$, $\ddot{y}$, $\ddot{z}$ in that formula.

But these formulae may be established in a much more direct manner. For the formula (8), although for many purposes the most convenient expression of the principle of virtual velocities, is by no means the most convenient for our present purpose.
As the usual name of the principle implies, it holds true of velocities as well as of displacements, and is perhaps more simple and more evident when thus applied.* If we wish to apply the principle, thus understood, to a moving system so as to determine whether certain changes of velocity specified by $\Delta\dot{x}, \Delta\dot{y}, \Delta\dot{z}$ are those which the system will really receive at a given instant, the velocities to be multiplied into the forces and reactions in the most simple application of the principle are manifestly such as may be imagined to be compounded with the assumed velocities, and are therefore properly specified by $\delta\Delta\dot{x}, \delta\Delta\dot{y}, \delta\Delta\dot{z}$. The formula (9) may therefore be regarded as the most direct application of the principle of virtual velocities to discontinuous changes of velocity in a moving system. In the case of a system in which there are no discontinuous changes of velocity, but which is subject to forces tending to produce accelerations, when we wish to determine whether certain accelerations, specified by $\ddot{x}, \ddot{y}, \ddot{z}$, are such as the system will really receive, it is evidently necessary to consider whether any possible variation of these accelerations is favored more than it is opposed by the forces and reactions of the system. The formula (6) expresses a criterion of this kind in the most simple and direct manner. If we regard a force as a tendency to increase a quantity expressed by $x$, the product of the force by $\delta\ddot{x}$ is the natural measure of the extent to which this tendency is satisfied by an arbitrary variation of the accelerations. The principle expressed by the formula may not be very accurately designated by the words virtual velocities, but it certainly does not differ from the principle of virtual velocities (in the stricter sense of the term), more than this differs from that of virtual displacements, a difference so slight that the distinction of the names is rarely insisted upon, and that it is often very difficult to tell which form of the principle is especially intended, even when the principle is enunciated or discussed somewhat at length.

* Even in Statics, the principle of virtual velocities, as distinguished from that of virtual displacements, has a certain advantage in respect of its evidence. The demonstration of the principle in the first section of the Mécanique Analytique, if velocities had been considered instead of displacements, would not have been exposed to an objection, which has been expressed by M. Bertrand in the following words: "On a objecté, avec raison, à cette assertion de Lagrange l'exemple d'un point pesant en équilibre au sommet le plus élevé d'une courbe ; il est évident qu'un déplacement infiniment petit le ferait descendre, et, pourtant, ce déplacement ne se produit pas." [The example of a heavy point in equilibrium at the highest summit of a curve has rightly been objected to this assertion of Lagrange; it is evident that an infinitely small displacement would make it descend, and yet this displacement does not occur.] (Mécanique Analytique, troisième édition, tome 1, page 22, note de M. Bertrand.) The value of z (the height of the point above a horizontal plane) can certainly be diminished by a displacement of the point, but the value of z is not affected by any velocity given to the point. The real difficulty in the consideration of displacements is that they are only possible at a time subsequent to that in which the system has the configuration to which the question of equilibrium relates. We may make the interval of time infinitely short, but it will always be difficult, in the establishing of fundamental principles, to treat a conception of this kind (relating to what is possible after an infinitesimal interval of time) with the same rigor as the idea of velocities or accelerations, which, in the cases to which (9) and (6) respectively relate, we may regard as communicated immediately to the system.
But, although the formulæ (6) and (9) differ so little from the ordinary formulæ, they not only have a marked advantage in respect of precision and accuracy, but also may be more satisfactory to the mind, in that the changes considered (to which $\delta$ relates), are not so violently opposed to all the possibilities of the case as are those which are represented by the variations of the coordinates.* Moreover, as we shall see, they naturally lead to various important laws of motion.

* It may have seemed to some readers of the Mécanique Analytique (a work of which the unity of method is one of the most striking characteristics, and that to which its universally recognized artistic merit is in great measure due) that the treatment of dynamical problems in that work is not entirely analogous to the treatment of statical problems. The statical question, whether a system will remain in equilibrium in a given configuration, is determined by Lagrange by considering all possible motions of the system and inquiring whether there is any reason why the system should take any one of them. A similar method in dynamics would be based upon a comparison of a proposed motion with all other motions of which the system is capable without violating its kinematical conditions. Instead of this, Lagrange virtually reduces the dynamical problem to a statical one, and considers, not the possible variations of the proposed motion, but the motions which would be possible if the system were at rest. This reduction of a given problem to a simpler one, which has already been solved, is a method which has its advantages, but it is not the characteristic method of the Mécanique Analytique. That which most distinguishes the plan of this treatise from the usual type is the direct application of the general principle to each particular case. The point is perhaps of small moment, and may be differently regarded by others, but it is mentioned here because it was a feeling of this kind (whether justified or not) and the desire to express the formula of motion by means of a maximum or minimum condition, in which the conditions under which the maximum or minimum subsists should be such as the problem naturally affords (Gauss's principle of least constraint being at the time unknown to the present writer, and the conditions under which the minimum subsists in the principle of least action being such that that is hardly satisfactory as a fundamental principle), which led to the formulæ proposed in this paper.

Transformation of the New Formula.

Let us now consider some of the transformations of which our general formula (6) is capable. If we separate the terms containing the masses of the particles from those which contain the forces, we have
$$\sum (X\,\delta\ddot{x} + Y\,\delta\ddot{y} + Z\,\delta\ddot{z}) - \sum\left[\tfrac{1}{2} m\,\delta(\ddot{x}^2 + \ddot{y}^2 + \ddot{z}^2)\right] \leqq 0.$$
[...]
$$\sum[P(\ddot{p}+\ddot{p}')] - \sum\left[\tfrac{1}{2} m\{(\ddot{x}+\ddot{x}')^2 + (\ddot{y}+\ddot{y}')^2 + (\ddot{z}+\ddot{z}')^2\}\right] - \sum(P\ddot{p}) + \sum\left[\tfrac{1}{2} m(\ddot{x}^2+\ddot{y}^2+\ddot{z}^2)\right]$$
$$= \sum(P\ddot{p}') - \sum[m(\ddot{x}\ddot{x}'+\ddot{y}\ddot{y}'+\ddot{z}\ddot{z}')] - \sum\left[\tfrac{1}{2} m(\ddot{x}'^2+\ddot{y}'^2+\ddot{z}'^2)\right].$$
But since $\ddot{p}', \ddot{x}', \ddot{y}', \ddot{z}'$ are proportional to and of the same sign with possible values of $\delta\ddot{p}, \delta\ddot{x}, \delta\ddot{y}, \delta\ddot{z}$, we have, by the general formula of motion,
$$\sum(P\ddot{p}') - \sum[m(\ddot{x}\ddot{x}'+\ddot{y}\ddot{y}'+\ddot{z}\ddot{z}')] \leqq 0.$$
The second member of the preceding equation is therefore negative. The first member is therefore negative, which proves the proposition with respect to (15). The demonstration is precisely the same with respect to (13) and (14), which may be regarded as particular cases of (15). To show the same with regard to (16) and (17), we have only to observe that the quantities affected by $\delta$ in these formulae differ from those affected by the same symbol in (14) and (15) only by the terms $\sum(\dot{X}\dot{x}+\dot{Y}\dot{y}+\dot{Z}\dot{z})$ and $\sum(\dot{P}\dot{p})$, which will not be affected by any change in the accelerations of the system.
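The maximum property just demonstrated can be checked numerically in a very simple case. The following is only a sketch under stated assumptions: a single particle constrained to a smooth straight wire, with the quantity to be maximized taken as $Xẍ + Yÿ + Zz̈ - \tfrac{1}{2}mu^2$; the function names and the numerical values are hypothetical and serve for illustration only.

```python
# Sketch (assumed setup): one particle of mass m on a smooth straight wire
# with unit direction d, acted on by a force F. Admissible accelerations are
# s*d. We scan s and compare the maximizer with the Newtonian value F.d/m.

def gibbs_function(force, mass, accel):
    """Return X*ax + Y*ay + Z*az - (1/2)*m*u^2 for a single particle."""
    work_term = sum(f * a for f, a in zip(force, accel))
    u_squared = sum(a * a for a in accel)
    return work_term - 0.5 * mass * u_squared


def best_acceleration_on_wire(force, mass, direction,
                              s_min=-20.0, s_max=20.0, steps=40001):
    """Brute-force search for the tangential acceleration s that maximizes
    gibbs_function among accelerations s*direction."""
    best_s, best_val = None, float("-inf")
    for i in range(steps):
        s = s_min + (s_max - s_min) * i / (steps - 1)
        accel = tuple(s * d for d in direction)
        val = gibbs_function(force, mass, accel)
        if val > best_val:
            best_s, best_val = s, val
    return best_s


force = (3.0, 0.0, -4.0)       # hypothetical force components X, Y, Z
mass = 2.0                     # hypothetical mass
direction = (0.6, 0.0, 0.8)    # unit vector along the wire

s_scan = best_acceleration_on_wire(force, mass, direction)
s_newton = sum(f * d for f, d in zip(force, direction)) / mass
print(s_scan, s_newton)
```

The scanned maximizer agrees with the projected force divided by the mass to within the grid spacing, which is what the proposition leads one to expect in this case.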
When the forces are determined by the configuration (with or without the time), the principle may be enunciated as follows: The accelerations in the system are always such that the acceleration of the rate of work done by the forces diminished by one-half the sum of the products of the masses of the particles by the squares of their accelerations has the greatest possible value. The formula (17), although in appearance less simple than (15), not only is more easily enunciated in words, but has the advantage that the quantity $\frac{d}{dt}\sum(P\dot{p})$ is entirely determined by the system with its forces and motions, which is not the case with $\sum(P\ddot{p})$. The value of the latter expression depends upon the manner in which we choose to represent the forces. For example, if a material point is revolving in a circle under the influence of a central force, we may write either $X\ddot{x} + Y\ddot{y} + Z\ddot{z}$ or $R\ddot{r}$ for $P\ddot{p}$, $R$ and $r$ denoting respectively the force and radius vector. Now $X\ddot{x}+Y\ddot{y}+Z\ddot{z}$ is manifestly unequal to $R\ddot{r}$. But $X\dot{x}+Y\dot{y}+Z\dot{z}$ is equal to $R\dot{r}$, and $\frac{d}{dt}(X\dot{x}+Y\dot{y}+Z\dot{z})$ is equal to $\frac{d}{dt}(R\dot{r})$.
It may not be without interest to see what shape our general formulae will take in one of the most important cases of forces dependent upon the velocities. If a body which can be treated as a point is moving in a medium which presents a resistance expressed by any function of the velocity, the terms due to that resistance in the general formula of motion may be expressed in the form
$$-\sum\left[\phi(v)\frac{\dot{x}}{v}\,\delta\ddot{x} + \phi(v)\frac{\dot{y}}{v}\,\delta\ddot{y} + \phi(v)\frac{\dot{z}}{v}\,\delta\ddot{z}\right],$$
where $v$ denotes the velocity and $\phi(v)$ the resistance. But
$$\frac{\dot{x}\ddot{x}}{v} + \frac{\dot{y}\ddot{y}}{v} + \frac{\dot{z}\ddot{z}}{v} = \frac{dv}{dt} = \dot{v}.$$
The terms due to the resistance reduce, therefore, to $-\sum[\phi(v)\,\delta\dot{v}]$, or, $-\delta\sum\frac{d}{dt}f(v)$, where $f$ denotes the primitive of the function denoted by $\phi$.
Discontinuous Changes of Velocity. Formula (9), which relates to discontinuous changes of velocity, is capable of similar transformations.
If we set $w^2 = \Delta\dot{x}^2 + \Delta\dot{y}^2 + \Delta\dot{z}^2$, the formula reduces to
$$\delta\sum(X\Delta\dot{x} + Y\Delta\dot{y} + Z\Delta\dot{z} - \tfrac{1}{2}mw^2) \leqq 0, \qquad (18)$$
where $X, Y, Z$ are to be regarded as constant. If $\sum(P\,dp)$ represents the sum of the moments of the impulsive forces, and we regard $P$ as constant, we have
$$\delta\left[\sum(P\Delta\dot{p}) - \sum(\tfrac{1}{2}mw^2)\right] \leqq 0. \qquad (19)$$
The expressions affected by $\delta$ in these formulae have a greater value than they would receive from any other changes of velocity consistent with the constraints of the system.

Deduction of other Properties of Motion.

The principles which have been established furnish a convenient point of departure for the demonstration of various properties of motion relating to maxima and minima. We may obtain several such properties by considering how the accelerations of a system, at a given instant, will be modified by changes of the forces or of the constraints to which the system is subject. Let us suppose that the forces $X, Y, Z$ of a system receive the increments $X', Y', Z'$, in consequence of which, and of certain additional constraints, which do not produce any discontinuity in the velocities, the components of acceleration $\ddot{x}, \ddot{y}, \ddot{z}$ receive the increments $\ddot{x}', \ddot{y}', \ddot{z}'$. The expression
$$\sum\left[(X+X')(\ddot{x}+\ddot{x}') + (Y+Y')(\ddot{y}+\ddot{y}') + (Z+Z')(\ddot{z}+\ddot{z}') - \tfrac{1}{2}m\{(\ddot{x}+\ddot{x}')^2 + (\ddot{y}+\ddot{y}')^2 + (\ddot{z}+\ddot{z}')^2\}\right] \qquad (20)$$
will be the greatest possible for any values of $\ddot{x}', \ddot{y}', \ddot{z}'$ consistent with the constraints. But this expression may be divided into three parts,
$$\sum\left[(X+X')\ddot{x} + (Y+Y')\ddot{y} + (Z+Z')\ddot{z} - \tfrac{1}{2}m(\ddot{x}^2+\ddot{y}^2+\ddot{z}^2)\right], \qquad (21)$$
$$\sum\left[X\ddot{x}' + Y\ddot{y}' + Z\ddot{z}' - m(\ddot{x}\ddot{x}'+\ddot{y}\ddot{y}'+\ddot{z}\ddot{z}')\right], \qquad (22)$$
and
$$\sum\left[X'\ddot{x}' + Y'\ddot{y}' + Z'\ddot{z}' - \tfrac{1}{2}m(\ddot{x}'^2+\ddot{y}'^2+\ddot{z}'^2)\right]. \qquad (23)$$
The first part is evidently constant with reference to variations of $\ddot{x}', \ddot{y}', \ddot{z}'$, and may, therefore, be neglected. With respect to the second part we observe that by the general formula of the motion we have
$$\sum\left[X\,\delta\ddot{x} + Y\,\delta\ddot{y} + Z\,\delta\ddot{z} - m(\ddot{x}\,\delta\ddot{x} + \ddot{y}\,\delta\ddot{y} + \ddot{z}\,\delta\ddot{z})\right] = 0$$
for all values of $\delta\ddot{x}, \delta\ddot{y}, \delta\ddot{z}$ which are possible and reversible before the addition of the new constraints.
But values proportional to $\ddot{x}', \ddot{y}', \ddot{z}'$, and of the same sign, are evidently consistent with the original constraints, and when the components of acceleration are altered to $\ddot{x}+\ddot{x}'$, $\ddot{y}+\ddot{y}'$, $\ddot{z}+\ddot{z}'$, variations of these quantities proportional to and of the same sign as $-\ddot{x}', -\ddot{y}', -\ddot{z}'$ are evidently consistent with the original constraints. Now if these latter variations were not possible before the accelerations were modified by the addition of the new forces and constraints, it must be that some constraint was then operative which afterwards ceased to be so. The expression (22) will, therefore, be equal to zero, provided only that all the constraints which were operative before the addition of the new forces and constraints, remain operative afterwards.* With this limitation, therefore, the expression (23) must have the greatest value consistent with the constraints.

* As an illustration of the significance of this limitation, we may consider the condition afforded by the impenetrability of two bodies in contact. Let us suppose that if subject only to the original forces and constraints they would continue in contact, but that, under the influence of the additional forces and constraints, the contact will cease. The impenetrability of the bodies then ceases to be operative as a constraint. Such cases form an exception to the principle which is to be established. But there are no exceptions when all the original constraints are expressed by equations.

This principle may be expressed without reference to rectangular coordinates. If we write $u'$ for the relative acceleration due to the additional forces and constraints, we have $u'^2 = \ddot{x}'^2 + \ddot{y}'^2 + \ddot{z}'^2$, and expression (23) reduces to
$$\sum(X'\ddot{x}' + Y'\ddot{y}' + Z'\ddot{z}' - \tfrac{1}{2}mu'^2). \qquad (24)$$
If the sum of the moments of the additional forces which are considered is represented by $\sum(Q\,dq)$ (the $q$ representing quantities determined by the configuration of the system), we have
$$\sum(X'\dot{x} + Y'\dot{y} + Z'\dot{z}) = \sum(Q\dot{q}).$$
We may distinguish the values of $\ddot{q}$ immediately before and immediately after the application of the additional forces and constraints by the expressions $\ddot{q}$ and $\ddot{q}+\ddot{q}'$. With this understanding, we have, by differentiation of the preceding equation,
$$\sum\left[\dot{X}'\dot{x} + \dot{Y}'\dot{y} + \dot{Z}'\dot{z} + X'(\ddot{x}+\ddot{x}') + Y'(\ddot{y}+\ddot{y}') + Z'(\ddot{z}+\ddot{z}')\right] = \sum\left[\dot{Q}\dot{q} + Q(\ddot{q}+\ddot{q}')\right];$$
whence it appears that $\sum(X'\ddot{x}' + Y'\ddot{y}' + Z'\ddot{z}')$ differs from $\sum(Q\ddot{q}')$ only by quantities which are independent of the relative acceleration due to the additional forces and constraints. It follows that these relative accelerations are such as to make
$$\sum(Q\ddot{q}') - \sum(\tfrac{1}{2}mu'^2) \qquad (25)$$
a maximum. It will be observed that the condition which determines these relative accelerations is of precisely the same form as that which determines absolute accelerations.
An important case is that in which new constraints are added but no new forces. The relative accelerations are determined in this case by the condition that $\sum(\tfrac{1}{2}mu'^2)$ is a minimum. In any case of motion, in which finite forces do not act at points, lines or surfaces, we may first calculate the accelerations which would be produced if there were no constraints, and then determine the relative accelerations due to the constraints by the condition that $\sum(\tfrac{1}{2}mu'^2)$ is a minimum. This is Gauss's principle of least constraint.*
Again, in any case of motion, we may suppose $u$ to denote the acceleration which would be produced by the constraints alone, and $u'$ the relative acceleration produced by the forces; we then have
$$\sum[m(\ddot{x}\ddot{x}' + \ddot{y}\ddot{y}' + \ddot{z}\ddot{z}')] = 0,$$
whence, if we write $u''$ for the resultant or actual acceleration,
$$\sum(\tfrac{1}{2}mu^2) + \sum(\tfrac{1}{2}mu'^2) = \sum(\tfrac{1}{2}mu''^2).$$
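The principle of least constraint stated above may be illustrated by a small numerical sketch. The setup is an assumption chosen for simplicity and is not taken from the text: a bead on a smooth circle in a vertical plane, whose radial acceleration is fixed by its speed, while the tangential acceleration is found by minimizing $\tfrac{1}{2}mu'^2$. All names and values are hypothetical.

```python
import math

# Sketch (assumed setup): bead of mass m on a smooth circle of given radius,
# at polar angle theta, moving with a given speed. The free (unconstrained)
# acceleration is that of gravity. The constraint fixes the radial component
# of the acceleration at -speed^2/radius; the tangential component s is free.

def dot(a, b):
    return a[0] * b[0] + a[1] * b[1]


def admissible_accel(theta, speed, radius, s):
    """Acceleration consistent with the circle, with tangential part s."""
    r_hat = (math.cos(theta), math.sin(theta))
    t_hat = (-math.sin(theta), math.cos(theta))
    radial = -speed ** 2 / radius
    return (radial * r_hat[0] + s * t_hat[0],
            radial * r_hat[1] + s * t_hat[1])


def constraint_measure(mass, accel, free_accel):
    """(1/2) m u'^2, where u' is the acceleration relative to the free one."""
    du = (accel[0] - free_accel[0], accel[1] - free_accel[1])
    return 0.5 * mass * dot(du, du)


def least_constraint_scan(theta, speed, radius, mass, free_accel,
                          s_min=-20.0, s_max=20.0, steps=40001):
    """Brute-force search for the tangential acceleration of least constraint."""
    best_s, best_val = None, float("inf")
    for i in range(steps):
        s = s_min + (s_max - s_min) * i / (steps - 1)
        val = constraint_measure(
            mass, admissible_accel(theta, speed, radius, s), free_accel)
        if val < best_val:
            best_s, best_val = s, val
    return best_s


theta, speed, radius, mass = 0.5, 1.5, 2.0, 0.3   # hypothetical values
free_accel = (0.0, -9.81)                          # gravity, assumed units

s_scan = least_constraint_scan(theta, speed, radius, mass, free_accel)
t_hat = (-math.sin(theta), math.cos(theta))
s_exact = dot(free_accel, t_hat)   # tangential component of the free acceleration

actual = admissible_accel(theta, speed, radius, s_scan)
relative = (actual[0] - free_accel[0], actual[1] - free_accel[1])
print(s_scan, s_exact, dot(relative, t_hat))
```

In this sketch the minimizing tangential acceleration coincides with the tangential component of the free acceleration, so that the relative acceleration due to the constraint is directed along the radius, as one would expect for a smooth constraint.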
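Moreover, differentiating (25), we obtain
$$\sum(Q\,\delta\ddot{q}') - \sum[m(\ddot{x}'\,\delta\ddot{x}' + \ddot{y}'\,\delta\ddot{y}' + \ddot{z}'\,\delta\ddot{z}')] = 0,$$
whence, since $\delta\ddot{q}', \delta\ddot{x}', \delta\ddot{y}', \delta\ddot{z}'$ may have values proportional to $\ddot{q}', \ddot{x}', \ddot{y}', \ddot{z}'$,
$$\sum(Q\ddot{q}') = \sum(mu'^2).$$
These relations are similar to those which exist with respect to vis viva and impulsive forces.

* This principle may be derived very directly from the general formula (6), or vice versa, for $\sum(\tfrac{1}{2}mu'^2)$ may be put in the form [...] the variation of which, with the sign changed, is identical with the first member of (6).

Particular Equations of Motion.

From the general formula (12), we may easily obtain particular equations which will express the laws of motion in a very general form. Let $d\omega_1$, $d\omega_2$, etc. be infinitesimals (not necessarily complete differentials) the values of which are independent, and by means of which we can perfectly define any infinitesimal change in the configuration of the system; and let
$$\dot{\omega}_1 = \frac{d\omega_1}{dt}, \qquad \dot{\omega}_2 = \frac{d\omega_2}{dt}, \qquad \text{etc.},$$
where $d\omega_1$, $d\omega_2$ are to be determined by the change in the configuration in the interval of time $dt$; and let
$$\ddot{\omega}_1 = \frac{d\dot{\omega}_1}{dt}, \qquad \ddot{\omega}_2 = \frac{d\dot{\omega}_2}{dt}, \qquad \text{etc.}$$
Also let $U = \sum(\tfrac{1}{2}mu^2)$. It is evident that $U$ can be expressed in terms of $\dot{\omega}_1, \dot{\omega}_2$, etc., $\ddot{\omega}_1, \ddot{\omega}_2$, etc., and the quantities which express the configuration of the system, and that (since $\delta$ is used to denote a variation which does not affect the configuration or the velocities),
$$\delta U = \frac{dU}{d\ddot{\omega}_1}\,\delta\ddot{\omega}_1 + \frac{dU}{d\ddot{\omega}_2}\,\delta\ddot{\omega}_2 + \text{etc.}$$
Moreover, since the quantities $p$ in the general formula are entirely determined by the configuration of the system,
$$\dot{p} = \frac{dp}{d\omega_1}\,\dot{\omega}_1 + \frac{dp}{d\omega_2}\,\dot{\omega}_2 + \text{etc.},$$
where $\frac{dp}{d\omega_1}$ denotes the ratio of simultaneous values of $dp$ and $d\omega_1$, when $d\omega_2$, etc. are equal to zero, and $\frac{dp}{d\omega_2}$, etc. are to be interpreted on the same principle. Multiplying by $P$, and taking the sum with respect to the several forces, we have
$$\sum(P\dot{p}) = \Omega_1\dot{\omega}_1 + \Omega_2\dot{\omega}_2 + \text{etc.},$$
where
$$\Omega_1 = \sum\left(P\frac{dp}{d\omega_1}\right), \qquad \Omega_2 = \sum\left(P\frac{dp}{d\omega_2}\right), \qquad \text{etc.}$$
If we differentiate with respect to $t$, and take the variation denoted by $\delta$, we obtain
$$\sum(P\,\delta\ddot{p}) = \Omega_1\,\delta\ddot{\omega}_1 + \Omega_2\,\delta\ddot{\omega}_2 + \text{etc.}$$
The general formula (12) is thus reduced to the form
$$\left(\Omega_1 - \frac{dU}{d\ddot{\omega}_1}\right)\delta\ddot{\omega}_1 + \left(\Omega_2 - \frac{dU}{d\ddot{\omega}_2}\right)\delta\ddot{\omega}_2 + \text{etc.} \leqq 0. \qquad (26)$$
If the forces have a potential $V$, we may write
$$\left(\frac{dV}{d\omega_1} - \frac{dU}{d\ddot{\omega}_1}\right)\delta\ddot{\omega}_1 + \left(\frac{dV}{d\omega_2} - \frac{dU}{d\ddot{\omega}_2}\right)\delta\ddot{\omega}_2 + \text{etc.} \leqq 0, \qquad (27)$$
where $\frac{dV}{d\omega_1}$ denotes the ratio of $dV$ and $d\omega_1$ when $d\omega_2$, etc. have the value zero, and the analogous expressions are to be interpreted on the same principle.
If the variations $\delta\ddot{\omega}_1$, $\delta\ddot{\omega}_2$, etc. are capable both of positive and of negative values, we must have
$$\frac{dU}{d\ddot{\omega}_1} = \Omega_1, \qquad \frac{dU}{d\ddot{\omega}_2} = \Omega_2, \qquad \text{etc.}, \qquad (28)$$
or,
$$\frac{dU}{d\ddot{\omega}_1} = \frac{dV}{d\omega_1}, \qquad \frac{dU}{d\ddot{\omega}_2} = \frac{dV}{d\omega_2}, \qquad \text{etc.} \qquad (29)$$
To illustrate the use of these equations in a case in which $d\omega_1$, $d\omega_2$, etc. are not exact differentials, we may apply them to the problem of the rotation of a rigid body of which one point is fixed. If $d\omega_1$, $d\omega_2$, $d\omega_3$ [...] ; and the equations of motion are
$$\ddot{\omega}_1 = \frac{(c-b)\,\dot{\omega}_2\dot{\omega}_3 + \Omega_1}{c+b}, \qquad \ddot{\omega}_2 = \frac{(a-c)\,\dot{\omega}_3\dot{\omega}_1 + \Omega_2}{a+c}, \qquad \ddot{\omega}_3 = \frac{(b-a)\,\dot{\omega}_1\dot{\omega}_2 + \Omega_3}{b+a}.$$

II. ON THE FUNDAMENTAL FORMULA OF STATISTICAL MECHANICS, WITH APPLICATIONS TO ASTRONOMY AND THERMODYNAMICS.

[Proceedings of the American Association for the Advancement of Science, vol. XXXIII. pp. 57, 58, 1884.]

(ABSTRACT.)

Suppose that we have a great number of systems which consist of material points and are identical in character, but different in configuration and velocities, and in which the forces are determined by the configuration alone. Let the number of systems in which the coordinates and velocities lie severally between the following limits, viz., between
$x_1$ and $x_1+dx_1$, $y_1$ and $y_1+dy_1$, $z_1$ and $z_1+dz_1$, $x_2$ and $x_2+dx_2$, etc.,
$\dot{x}_1$ and $\dot{x}_1+d\dot{x}_1$, $\dot{y}_1$ and $\dot{y}_1+d\dot{y}_1$, $\dot{z}_1$ and $\dot{z}_1+d\dot{z}_1$, $\dot{x}_2$ and $\dot{x}_2+d\dot{x}_2$, etc.,
be denoted by
$$L\,dx_1\,dy_1\,dz_1\,dx_2\ \text{etc.}\ d\dot{x}_1\,d\dot{y}_1\,d\dot{z}_1\,d\dot{x}_2\ \text{etc.}$$
The manner in which the quantity $L$ varies with the time is given by the equation
$$\frac{dL}{dt} = -\sum\left(\frac{dL}{dx}\,\dot{x} + \frac{dL}{d\dot{x}}\,\ddot{x}\right),$$
where $t, x_1, y_1, z_1, x_2$, etc., $\dot{x}_1, \dot{y}_1, \dot{z}_1, \dot{x}_2$, etc., are the independent variables, and the summation relates to all the coordinates.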
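The equation for the variation of $L$ with the time can be checked by finite differences in a case where the motion is known exactly. The sketch below rests on assumptions that are not in the abstract: a single coordinate, unit mass, and a force equal to $-x$ (so that $\ddot{x} = -x$), with an arbitrarily chosen initial distribution. The names are hypothetical.

```python
import math

# Sketch (assumed setup): systems with one coordinate x and velocity v,
# obeying x'' = -x. A system at (x, v) at time t started from
#   x0 = x cos t - v sin t,   v0 = x sin t + v cos t,
# so the density at time t is the initial density evaluated at (x0, v0).

def initial_density(x0, v0):
    """Arbitrary smooth initial distribution (an assumption for illustration)."""
    return math.exp(-((x0 - 1.0) ** 2 + 2.0 * v0 ** 2))


def density(t, x, v):
    """L(t, x, v) obtained by carrying the initial density along the motion."""
    x0 = x * math.cos(t) - v * math.sin(t)
    v0 = x * math.sin(t) + v * math.cos(t)
    return initial_density(x0, v0)


def liouville_residual(t, x, v, h=1e-5):
    """dL/dt + (dL/dx) x' + (dL/dv) x'' by central differences; should be ~0."""
    dL_dt = (density(t + h, x, v) - density(t - h, x, v)) / (2 * h)
    dL_dx = (density(t, x + h, v) - density(t, x - h, v)) / (2 * h)
    dL_dv = (density(t, x, v + h) - density(t, x, v - h)) / (2 * h)
    accel = -x   # force determined by the configuration alone
    return dL_dt + dL_dx * v + dL_dv * accel


max_residual = max(
    abs(liouville_residual(t, x, v))
    for t in (0.0, 0.7, 2.0)
    for x in (-0.5, 0.3, 1.2)
    for v in (-1.0, 0.4)
)
print(max_residual)
```

The residual is zero up to the error of the difference quotients, which is consistent with the stated equation in this one-coordinate case.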
The object of the paper is to establish this proposition (which is not claimed as new, but which has hardly received the recognition which it deserves) and to show its applications to astronomy and thermodynamics.

III. ELEMENTS OF VECTOR ANALYSIS.

[Privately printed, New Haven, pp. 17-50, 1881; pp. 50-90, 1884.]

(The fundamental principles of the following analysis are such as are familiar under a slightly different form to students of quaternions. The manner in which the subject is developed is somewhat different from that followed in treatises on quaternions, since the object of the writer does not require any use of the conception of the quaternion, being simply to give a suitable notation for those relations between vectors, or between vectors and scalars, which seem most important, and which lend themselves most readily to analytical transformations, and to explain some of these transformations. As a precedent for such a departure from quaternionic usage, Clifford's Kinematic may be cited. In this connection, the name of Grassmann may also be mentioned, to whose system the following method attaches itself in some respects more closely than to that of Hamilton.)

CHAPTER I.

CONCERNING THE ALGEBRA OF VECTORS.

Fundamental Notions.

1. Definition. If anything has magnitude and direction, its magnitude and direction taken together constitute what is called a vector. The numerical description of a vector requires three numbers, but nothing prevents us from using a single letter for its symbolical designation. An algebra or analytical method in which a single letter or other expression is used to specify a vector may be called a vector algebra or vector analysis.
Def. As distinguished from vectors the real (positive or negative) quantities of ordinary algebra are called scalars.* As it is convenient that the form of the letter should indicate whether a vector or a scalar is denoted, we shall use the small Greek letters to denote vectors, and the small English letters to denote scalars. (The three letters, $i, j, k$, will make an exception, to be mentioned more particularly hereafter. Moreover, $\pi$ will be used in its usual scalar sense, to denote the ratio of the circumference of a circle to its diameter.)

* The imaginaries of ordinary algebra may be called biscalars, and that which corresponds to them in the theory of vectors, bivectors. But we shall have no occasion to consider either of these. [See, however, footnote on p. 84.]

2. Def. Vectors are said to be equal when they are the same both in direction and in magnitude. This equality is denoted by the ordinary sign, as $\alpha = \beta$. The reader will observe that this vector equation is the equivalent of three scalar equations. A vector is said to be equal to zero, when its magnitude is zero. Such vectors may be set equal to one another, irrespectively of any considerations relating to direction.

3. Perhaps the most simple example of a vector is afforded by a directed straight line, as the line drawn from A to B. We may use the notation $\overline{AB}$ to denote this line as a vector, i.e., to denote its length and direction without regard to its position in other respects. The points A and B may be distinguished as the origin and the terminus of the vector. Since any magnitude may be represented by a length, any vector may be represented by a directed line; and it will often be convenient to use language relating to vectors, which refers to them as thus represented.

Reversal of Direction, Scalar Multiplication and Division.

4. The negative sign ($-$) reverses the direction of a vector.
(Sometimes the sign $+$ may be used to call attention to the fact that the vector has not the negative sign.) Def. A vector is said to be multiplied or divided by a scalar when its magnitude is multiplied or divided by the numerical value of the scalar and its direction is either unchanged or reversed according as the scalar is positive or negative. These operations are represented by the same methods as multiplication and division in algebra, and are to be regarded as substantially identical with them. The terms scalar multiplication and scalar division are used to denote multiplication and division by scalars, whether the quantity multiplied or divided is a scalar or a vector.

5. Def. A unit vector is a vector of which the magnitude is unity. Any vector may be regarded as the product of a positive scalar (the magnitude of the vector) and a unit vector. The notation $\alpha_0$ may be used to denote the magnitude of the vector $\alpha$.

Addition and Subtraction of Vectors.

6. Def. The sum of the vectors $\alpha, \beta$, etc. (written $\alpha + \beta + \text{etc.}$) is the vector found by the following process. Assuming any point A, we determine successively the points B, C, etc., so that $\overline{AB} = \alpha$, $\overline{BC} = \beta$, etc. The vector drawn from A to the last point thus determined is the sum required. This is sometimes called the geometrical sum, to distinguish it from an algebraic sum or an arithmetical sum. It is also called the resultant, and $\alpha, \beta$, etc. are called the components. When the vectors to be added are all parallel to the same straight line, geometrical addition reduces to algebraic; when they have all the same direction, geometrical addition like algebraic reduces to arithmetical. It may easily be shown that the value of a sum is not affected by changing the order of two consecutive terms, and therefore that it is not affected by any change in the order of the terms.
Again, it is evident from the definition that the value of a sum is not altered by uniting any of its terms in brackets, as $\alpha + [\beta + \gamma] + \text{etc.}$, which is in effect to substitute the sum of the terms enclosed for the terms themselves among the vectors to be added. In other words, the commutative and associative principles of arithmetical and algebraic addition hold true of geometrical addition.

7. Def. A vector is said to be subtracted when it is added after reversal of direction. This is indicated by the use of the sign $-$ instead of $+$.

8. It is easily shown that the distributive principle of arithmetical and algebraic multiplication applies to the multiplication of sums of vectors by scalars or sums of scalars, i.e.,
$$(m + n + \text{etc.})[\alpha + \beta + \text{etc.}] = m\alpha + n\alpha + \text{etc.} + m\beta + n\beta + \text{etc.} + \text{etc.}$$

9. Vector Equations. If we have equations between sums and differences of vectors, we may transpose terms in them, multiply or divide by any scalar, and add or subtract the equations, precisely as in the case of the equations of ordinary algebra. Hence, if we have several such equations containing known and unknown vectors, the processes of elimination and reduction by which the unknown vectors may be expressed in terms of the known are precisely the same, and subject to the same limitations, as if the letters representing vectors represented scalars. This will be evident if we consider that in the multiplications incident to elimination in the supposed scalar equations the multipliers are the coefficients of the unknown quantities, or functions of these coefficients, and that such multiplications may be applied to the vector equations, since the coefficients are scalars.

10. Linear relation of four vectors, Coordinates. If $\alpha, \beta$, and $\gamma$ are any given vectors not parallel to the same plane, any other vector $\rho$ may be expressed in the form
$$\rho = a\alpha + b\beta + c\gamma.$$
If $\alpha, \beta$, and $\gamma$ are unit vectors, $a, b$, and $c$ are the ordinary scalar components of $\rho$ parallel to $\alpha, \beta$, and $\gamma$.
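The expression of a vector in terms of three others not parallel to the same plane can be sketched numerically. The following is an illustration only, assuming the vectors are given by rectangular components and using Cramer's rule of ordinary algebra; the function names and the sample vectors are hypothetical.

```python
# Sketch: find scalars a, b, c with rho = a*alpha + b*beta + c*gamma,
# where alpha, beta, gamma are not parallel to the same plane.
# Vectors are represented as triples of rectangular components (assumption).

def det3(p, q, r):
    """Determinant of the 3x3 array whose rows are p, q, r."""
    return (p[0] * (q[1] * r[2] - q[2] * r[1])
            - p[1] * (q[0] * r[2] - q[2] * r[0])
            + p[2] * (q[0] * r[1] - q[1] * r[0]))


def coefficients(rho, alpha, beta, gamma):
    """Return (a, b, c) by Cramer's rule; fails if the three are coplanar."""
    d = det3(alpha, beta, gamma)
    if d == 0:
        raise ValueError("alpha, beta, gamma are parallel to the same plane")
    a = det3(rho, beta, gamma) / d
    b = det3(alpha, rho, gamma) / d
    c = det3(alpha, beta, rho) / d
    return a, b, c


alpha = (1.0, 0.0, 0.0)
beta = (1.0, 1.0, 0.0)
gamma = (1.0, 1.0, 1.0)
rho = (2.0, 3.0, 4.0)

a, b, c = coefficients(rho, alpha, beta, gamma)
# Recombine to confirm that a*alpha + b*beta + c*gamma reproduces rho.
recombined = tuple(a * alpha[n] + b * beta[n] + c * gamma[n] for n in range(3))
print((a, b, c), recombined)
```

When the three reference vectors are coplanar the determinant vanishes and no unique coefficients exist, which corresponds to the condition in the text that they be not parallel to the same plane.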
If $\rho = \overline{OP}$ ($\alpha, \beta, \gamma$ being unit vectors), $a, b$, and $c$ are the cartesian coordinates of the point P referred to axes through O parallel to $\alpha, \beta$, and $\gamma$. When the values of these scalars are given, $\rho$ is said to be given in terms of $\alpha, \beta$, and $\gamma$. It is generally in this way that the value of a vector is specified, viz., in terms of three known vectors. For such purposes of reference, a system of three mutually perpendicular vectors has certain evident advantages.

11. Normal systems of unit vectors. The letters $i, j, k$ are appropriated to the designation of a normal system of unit vectors, i.e., three unit vectors, each of which is at right angles to the other two and determined in direction by them in a perfectly definite manner. We shall always suppose that $k$ is on the side of the $i$-$j$ plane on which a rotation from $i$ to $j$ (through one right angle) appears counter-clockwise. In other words, the directions of $i, j$, and $k$ are to be so determined that if they be turned (remaining rigidly connected with each other) so that $i$ points to the east, and $j$ to the north, $k$ will point upward. When rectangular axes of X, Y, and Z are employed, their directions will be conformed to a similar condition, and $i, j, k$ (when the contrary is not stated) will be supposed parallel to these axes respectively. We may have occasion to use more than one such system of unit vectors, just as we may use more than one system of coordinate axes. In such cases, the different systems may be distinguished by accents or otherwise.

12. Numerical computation of a geometrical sum. If
$$\rho = a\alpha + b\beta + c\gamma, \qquad \sigma = a'\alpha + b'\beta + c'\gamma, \qquad \text{etc.},$$
then
$$\rho + \sigma + \text{etc.} = (a + a' + \text{etc.})\,\alpha + (b + b' + \text{etc.})\,\beta + (c + c' + \text{etc.})\,\gamma,$$
i.e., the coefficients by which a geometrical sum is expressed in terms of three vectors are the sums of the coefficients by which the separate terms of the geometrical sum are expressed in terms of the same three vectors.

Direct and Skew Products of Vectors.

13. Def. The direct product of $\alpha$ and $\beta$ (written $\alpha.\beta$) is the scalar quantity obtained by multiplying the product of their magnitudes by the cosine of the angle made by their directions.

14. Def. The skew product of $\alpha$ and $\beta$ (written $\alpha\times\beta$) is a vector function of $\alpha$ and $\beta$. Its magnitude is obtained by multiplying the product of the magnitudes of $\alpha$ and $\beta$ by the sine of the angle made by their directions. Its direction is at right angles to $\alpha$ and $\beta$, and on that side of the plane containing $\alpha$ and $\beta$ (supposed drawn from a common origin) on which a rotation from $\alpha$ to $\beta$ through an arc of less than 180° appears counter-clockwise. The direction of $\alpha\times\beta$ may also be defined as that in which an ordinary screw advances as it turns so as to carry $\alpha$ toward $\beta$. Again, if $\alpha$ be directed toward the east, and $\beta$ lie in the same horizontal plane and on the north side of $\alpha$, $\alpha\times\beta$ will be directed upward.

15. It is evident from the preceding definitions that
$$\alpha.\beta = \beta.\alpha, \quad \text{and} \quad \alpha\times\beta = -\beta\times\alpha.$$

16. Moreover,
$$[n\alpha].\beta = \alpha.[n\beta] = n[\alpha.\beta], \quad \text{and} \quad [n\alpha]\times\beta = \alpha\times[n\beta] = n[\alpha\times\beta].$$
The brackets may therefore be omitted in such expressions.

17. From the definitions of No. 11 it appears that
$$i.i = j.j = k.k = 1, \qquad i.j = j.i = i.k = k.i = j.k = k.j = 0,$$
$$i\times i = 0, \quad j\times j = 0, \quad k\times k = 0,$$
$$i\times j = k, \quad j\times k = i, \quad k\times i = j,$$
$$j\times i = -k, \quad k\times j = -i, \quad i\times k = -j.$$

18. If we resolve $\beta$ into two components $\beta'$ and $\beta''$, of which the first is parallel and the second perpendicular to $\alpha$, we shall have
$$\alpha.\beta = \alpha.\beta' \quad \text{and} \quad \alpha\times\beta = \alpha\times\beta''.$$

19. $\alpha.[\beta+\gamma] = \alpha.\beta + \alpha.\gamma$ and $\alpha\times[\beta+\gamma] = \alpha\times\beta + \alpha\times\gamma$. To prove this, let [...]
$$[\alpha+\beta].[\gamma+\delta] = \alpha.\gamma + \alpha.\delta + \beta.\gamma + \beta.\delta,$$
$$[\alpha+\beta]\times[\gamma+\delta] = \alpha\times\gamma + \alpha\times\delta + \beta\times\gamma + \beta\times\delta.$$
[...] and $\gamma$ (supposed drawn from a common origin) are the edges, and that the value of the expression is positive or negative according as $\gamma$ lies on the side of the plane of $\alpha$ and $\beta$ on which the rotation from $\alpha$ to $\beta$ appears counter-clockwise, or on the opposite side.

24. Hence,
$$\alpha\times\beta.\gamma = \beta\times\gamma.\alpha = \gamma\times\alpha.\beta = \gamma.\alpha\times\beta = \alpha.\beta\times\gamma = \beta.\gamma\times\alpha$$
$$= -\beta\times\alpha.\gamma = -\gamma\times\beta.\alpha = -\alpha\times\gamma.\beta = -\gamma.\beta\times\alpha = -\alpha.\gamma\times\beta = -\beta.\alpha\times\gamma.$$
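The relations of No. 24 lend themselves to a direct numerical check. The sketch below assumes that vectors are given by their rectangular components with respect to $i, j, k$ and computes the direct and skew products from those components; the function names and sample vectors are hypothetical.

```python
# Sketch: direct product, skew product, and the scalar product of three
# vectors, computed from rectangular components (an assumption; the text
# defines the products geometrically).

def direct(a, b):
    """Direct product a.b"""
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def skew(a, b):
    """Skew product a x b"""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def triple(a, b, c):
    """Scalar product of three vectors, a x b . c, interpreted as [a x b].c"""
    return direct(skew(a, b), c)


alpha = (1, 2, 3)
beta = (-2, 0, 1)
gamma = (4, -1, 2)

same_cyclic_order = [triple(alpha, beta, gamma),
                     triple(beta, gamma, alpha),
                     triple(gamma, alpha, beta)]
opposite_cyclic_order = [triple(beta, alpha, gamma),
                         triple(gamma, beta, alpha),
                         triple(alpha, gamma, beta)]
print(same_cyclic_order, opposite_cyclic_order)
```

With integer components the three products taken in the same cyclic order come out exactly equal, and those taken in the opposite cyclic order are equal to them with the sign changed.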
It will be observed that all the products of this type, which can be made with three given vectors, are the same in numerical value, and that any two such products are of the same or opposite character in respect to sign, according as the cyclic order of the letters is the same or different. The product vanishes when two of the vectors are parallel to the same line, or when the three are parallel to the same plane.

This kind of product may be called the scalar product of the three vectors. There are two other kinds of products of three vectors, both of which are vectors, viz., products of the type $(\alpha.\beta)\gamma$ or $\gamma(\alpha.\beta)$, and products of the type $\alpha\times[\beta\times\gamma]$ or $[\gamma\times\beta]\times\alpha$.

* Since the sign $\times$ is only used between vectors, the skew multiplication in expressions of this kind is evidently to be performed first. In other words, the above expression must be interpreted as $[\alpha\times\beta].\gamma$.

25. $$i.j\times k = j.k\times i = k.i\times j = 1, \qquad i.k\times j = k.j\times i = j.i\times k = -1.$$
From these equations, which follow immediately from those of No. 17, the propositions of the last section might have been derived, viz., by substituting for $\alpha$, $\beta$, and $\gamma$, respectively, expressions of the form $xi+yj+zk$, $x'i+y'j+z'k$, and $x''i+y''j+z''k$.* Such a method, which may be called expansion in terms of $i$, $j$, and $k$, will on many occasions afford very simple, although perhaps lengthy, demonstrations.

* The student who is familiar with the nature of determinants will not fail to observe that the triple product $\alpha.\beta\times\gamma$ is the determinant formed by the nine rectangular components of $\alpha$, $\beta$, and $\gamma$, nor that the rectangular components of $\alpha\times\beta$ are determinants of the second order formed from the components of $\alpha$ and $\beta$. (See the last equation of No. 21.)

26. Triple products containing only two different letters.—The significance and the relations of $(\alpha.\alpha)\beta$, $(\alpha.\beta)\alpha$, and $\alpha\times[\alpha\times\beta]$ will be most evident, if we consider $\beta$ as made up of two components, $\beta'$ and $\beta''$, respectively parallel and perpendicular to $\alpha$. Then
$$\beta = \beta'+\beta'',$$
$$(\alpha.\beta)\alpha = (\alpha.\beta')\alpha = (\alpha.\alpha)\beta',$$
$$\alpha\times[\alpha\times\beta] = \alpha\times[\alpha\times\beta''] = -(\alpha.\alpha)\beta''.$$
Hence,
$$\alpha\times[\alpha\times\beta] = (\alpha.\beta)\alpha - (\alpha.\alpha)\beta.$$

27. General relation of the vector products of three factors.—In the triple product $\alpha\times[\beta\times\gamma]$ we may set
$$\alpha = l\beta + m\gamma + n\,\beta\times\gamma,$$
unless $\beta$ and $\gamma$ have the same direction. Then
$$\alpha\times[\beta\times\gamma] = l\beta\times[\beta\times\gamma] + m\gamma\times[\beta\times\gamma]$$
$$= l(\beta.\gamma)\beta - l(\beta.\beta)\gamma - m(\gamma.\beta)\gamma + m(\gamma.\gamma)\beta$$
$$= (l\beta.\gamma + m\gamma.\gamma)\beta - (l\beta.\beta + m\gamma.\beta)\gamma.$$
But $l\beta.\gamma + m\gamma.\gamma = \alpha.\gamma$, and $l\beta.\beta + m\gamma.\beta = \alpha.\beta$. Therefore
$$\alpha\times[\beta\times\gamma] = (\alpha.\gamma)\beta - (\alpha.\beta)\gamma,$$
which is evidently true, when $\beta$ and $\gamma$ have the same directions. It may also be written
$$[\gamma\times\beta]\times\alpha = \beta(\gamma.\alpha) - \gamma(\beta.\alpha).$$

28. This principle may be used in the transformation of more complex products. It will be observed that its application will always simultaneously eliminate, or introduce, two signs of skew multiplication.

The student will easily prove the following identical equations, which, although of considerable importance, are here given principally as exercises in the application of the preceding formulae.

29. $\alpha\times[\beta\times\gamma] + \beta\times[\gamma\times\alpha] + \gamma\times[\alpha\times\beta] = 0.$

30. $[\alpha\times\beta].[\gamma\times\delta] = (\alpha.\gamma)(\beta.\delta) - (\alpha.\delta)(\beta.\gamma).$

31. $[\alpha\times\beta]\times[\gamma\times\delta] = {}$ [...]

[...] (2) which is the solution required.

It results from the principle stated in No. 35, that any vector equation of the first degree with respect to $\rho$ may be reduced to the form
$$\delta = \alpha(\lambda.\rho) + \beta(\mu.\rho) + \gamma(\nu.\rho) + a\rho + \epsilon\times\rho.$$
But
$$a\rho = a\lambda'(\lambda.\rho) + a\mu'(\mu.\rho) + a\nu'(\nu.\rho),$$
and
$$\epsilon\times\rho = \epsilon\times\lambda'(\lambda.\rho) + \epsilon\times\mu'(\mu.\rho) + \epsilon\times\nu'(\nu.\rho),$$
where $\lambda'$, $\mu'$, $\nu'$ represent, as before, the reciprocals of $\lambda$, $\mu$, $\nu$. By substitution of these values the equation is reduced to the form of equation (1), which may therefore be regarded as the most general form of a vector equation of the first degree with respect to $\rho$.

41.
Relations between two normal systems of unit vectors.—If $i$, $j$, $k$, and $i'$, $j'$, $k'$ are two normal systems of unit vectors, we have
$$\begin{aligned} i' &= (i.i')i + (j.i')j + (k.i')k,\\ j' &= (i.j')i + (j.j')j + (k.j')k,\\ k' &= (i.k')i + (j.k')j + (k.k')k, \end{aligned} \tag{1}$$
$$\begin{aligned} i &= (i.i')i' + (i.j')j' + (i.k')k',\\ j &= (j.i')i' + (j.j')j' + (j.k')k',\\ k &= (k.i')i' + (k.j')j' + (k.k')k'. \end{aligned} \tag{2}$$
(See equation (8) of No. 38.)

The nine coefficients in these equations are evidently the cosines of the nine angles made by a vector of one system with a vector of the other system. The principal relations of these cosines are easily deduced. By direct multiplication of each of the preceding equations with itself, we obtain six equations of the type
$$(i.i')^2 + (j.i')^2 + (k.i')^2 = 1. \tag{3}$$
By direct multiplication of equations (1) with each other, and of equations (2) with each other, we obtain six of the type
$$(i.i')(i.j') + (j.i')(j.j') + (k.i')(k.j') = 0. \tag{4}$$
By skew multiplication of equations (1) with each other, we obtain three of the type
$$k' = \{(j.i')(k.j') - (k.i')(j.j')\}i + \{(k.i')(i.j') - (i.i')(k.j')\}j + \{(i.i')(j.j') - (j.i')(i.j')\}k.$$
Comparing these three equations with the original three, we obtain nine of the type
$$i.k' = (j.i')(k.j') - (k.i')(j.j'). \tag{5}$$
Finally, if we equate the scalar product of the three right hand members of (1) with that of the three left hand members, we obtain
$$(i.i')(j.j')(k.k') + (i.j')(j.k')(k.i') + (i.k')(j.i')(k.j') - (k.i')(j.j')(i.k') - (k.j')(j.k')(i.i') - (k.k')(j.i')(i.j') = 1. \tag{6}$$

Equations (1) and (2) (if the expressions in the parentheses are supposed replaced by numerical values) represent the linear relations which subsist between one vector of one system and the three vectors of the other system. If we desire to express the similar relations which subsist between two vectors of one system and two of the other, we may take the skew products of equations (1) with equations (2), after transposing all terms in the latter. This will afford nine equations of the type
$$(i.j')k' - (i.k')j' = (k.i')j - (j.i')k. \tag{7}$$

We may divide an equation by an indeterminate direct factor. [MS. note by author.]

CHAPTER II.
CONCERNING THE DIFFERENTIAL AND INTEGRAL CALCULUS OF VECTORS.

42. Differentials of vectors.—The differential of a vector is the geometrical difference of two values of that vector which differ infinitely little. It is itself a vector, and may make any angle with the vector differentiated. It is expressed by the same sign ($d$) as the differentials of ordinary analysis.

With reference to any fixed axes, the components of the differential of a vector are manifestly equal to the differentials of the components of the vector, i.e., if $\alpha$, $\beta$, and $\gamma$ are fixed unit vectors, and
$$\rho = x\alpha + y\beta + z\gamma,$$
$$d\rho = dx\,\alpha + dy\,\beta + dz\,\gamma.$$

43. Differential of a function of several variables.—The differential of a vector or scalar function of any number of vector or scalar variables is evidently the sum (geometrical or algebraic, according as the function is vector or scalar) of the differentials of the function due to the separate variation of the several variables.

44. Differential of a product.—The differential of a product of any kind due to the variation of a single factor is obtained by prefixing the sign of differentiation to that factor in the product. This is evidently true of differentials, since it will hold true even of finite differences.

45. From these principles we obtain the following identical equations:
$$d(\alpha+\beta) = d\alpha + d\beta, \tag{1}$$
$$d(n\alpha) = dn\,\alpha + n\,d\alpha, \tag{2}$$
$$d(\alpha.\beta) = d\alpha.\beta + \alpha.d\beta, \tag{3}$$
$$d[\alpha\times\beta] = d\alpha\times\beta + \alpha\times d\beta, \tag{4}$$
$$d(\alpha.\beta\times\gamma) = d\alpha.\beta\times\gamma + \alpha.d\beta\times\gamma + \alpha.\beta\times d\gamma, \tag{5}$$
$$d[(\alpha.\beta)\gamma] = (d\alpha.\beta)\gamma + (\alpha.d\beta)\gamma + (\alpha.\beta)d\gamma. \tag{6}$$

46. Differential coefficient with respect to a scalar.—The quotient obtained by dividing the differential of a vector due to the variation of any scalar of which it is a function by the differential of that scalar is called the differential coefficient of the vector with respect to the scalar, and is indicated in the same manner as the differential coefficients of ordinary analysis.
If we suppose the quantities occurring in the six equations of the last section to be functions of a scalar $t$, we may substitute $\frac{d}{dt}$ for $d$ in those equations since this is only to divide all terms by the scalar $dt$.

47. Successive differentiations.—The differential coefficient of a vector with respect to a scalar is of course a finite vector, of which we may take the differential, or the differential coefficient with respect to the same or any other scalar. We thus obtain differential coefficients of the higher orders, which are indicated as in the scalar calculus.

A few examples will serve for illustration.

If $\rho$ is the vector drawn from a fixed origin to a moving point at any time $t$, $\frac{d\rho}{dt}$ will be the vector representing the velocity of the point, and $\frac{d^2\rho}{dt^2}$ the vector representing its acceleration.

If $\rho$ is the vector drawn from a fixed origin to any point on a curve, and $s$ the distance of that point measured on the curve from any fixed point, $\frac{d\rho}{ds}$ is a unit vector, tangent to the curve and having the direction in which $s$ increases; $\frac{d^2\rho}{ds^2}$ is a vector directed from a point on the curve to the center of curvature, and equal to the curvature; $\frac{d\rho}{ds}\times\frac{d^2\rho}{ds^2}$ is the normal to the osculating plane, directed to the side on which the curve appears described counter-clockwise about the center of curvature, and equal to the curvature. The tortuosity (or rate of rotation of the osculating plane, considered as positive when the rotation appears counter-clockwise as seen from the direction in which $s$ increases) is represented by
$$\frac{\dfrac{d\rho}{ds}.\dfrac{d^2\rho}{ds^2}\times\dfrac{d^3\rho}{ds^3}}{\dfrac{d^2\rho}{ds^2}.\dfrac{d^2\rho}{ds^2}}.$$

48. Integration of an equation between differentials.—If $t$ and $u$ are two single-valued continuous scalar functions of any number of scalar or vector variables, and
$$dt = du,$$
then
$$t = u + a,$$
where $a$ is a scalar constant.

Or, if $\tau$ and $\omega$ are two single-valued continuous vector functions of any number of scalar or vector variables, and
$$d\tau = d\omega,$$
then
$$\tau = \omega + \alpha,$$
where $\alpha$ is a vector constant.
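The curve examples of No. 47 may be illustrated on a circular helix, for which the derivatives with respect to the arc $s$ are known in closed form. This is a hedged sketch only: the helix, its radius `R` and rise `h`, and all function names are assumptions made for the example.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def helix_derivatives(s, R=3.0, h=4.0):
    """First three derivatives of rho(s) = (R cos(s/c), R sin(s/c), h s/c),
    where c = sqrt(R^2 + h^2) makes s the arc length."""
    c = math.hypot(R, h)
    t = s / c
    d1 = (-R / c * math.sin(t), R / c * math.cos(t), h / c)
    d2 = (-R / c**2 * math.cos(t), -R / c**2 * math.sin(t), 0.0)
    d3 = (R / c**3 * math.sin(t), -R / c**3 * math.cos(t), 0.0)
    return d1, d2, d3

d1, d2, d3 = helix_derivatives(s=1.7)

tangent_length = math.sqrt(dot(d1, d1))         # d(rho)/ds is a unit vector
curvature = math.sqrt(dot(d2, d2))              # magnitude of d2(rho)/ds2
normal_to_osculating_plane = cross(d1, d2)      # its magnitude is also the curvature
tortuosity = dot(d1, cross(d2, d3)) / dot(d2, d2)
```

For this helix the curvature is $R/(R^2+h^2)$ and the tortuosity $h/(R^2+h^2)$, independent of $s$, which gives a simple check on the formula for the tortuosity.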
When the above hypotheses are not satisfied in general, but will be satisfied if the variations of the independent variables are confined within certain limits, then the conclusions will hold within those limits, provided that we can pass by continuous variation of the independent variables from any values within the limits to any other values within them, without transgressing the limits.

49. So far, it will be observed, all operations have been entirely analogous to those of the ordinary calculus.

Functions of Position in Space.

50. Def.—If $u$ is any scalar function of position in space (i.e., any scalar quantity having continuously varying values in space), $\nabla u$ is the vector function of position in space which has everywhere the direction of the most rapid increase of $u$, and a magnitude equal to the rate of that increase per unit of length. $\nabla u$ may be called the derivative of $u$, and $u$, the primitive of $\nabla u$.

We may also take any one of the Nos. 51, 52, 53 for the definition of $\nabla u$.

51. If $\rho$ is the vector defining the position of a point in space,
$$du = \nabla u.d\rho.$$

52. $$\nabla u = i\frac{du}{dx} + j\frac{du}{dy} + k\frac{du}{dz}.$$

53. $$\frac{du}{dx} = i.\nabla u, \qquad \frac{du}{dy} = j.\nabla u, \qquad \frac{du}{dz} = k.\nabla u.$$

54. Def.—If $\omega$ is a vector having continuously varying values in space,
$$\nabla.\omega = i.\frac{d\omega}{dx} + j.\frac{d\omega}{dy} + k.\frac{d\omega}{dz}, \tag{1}$$
and
$$\nabla\times\omega = i\times\frac{d\omega}{dx} + j\times\frac{d\omega}{dy} + k\times\frac{d\omega}{dz}. \tag{2}$$
$\nabla.\omega$ is called the divergence of $\omega$ and $\nabla\times\omega$ its curl.

If we set
$$\omega = Xi + Yj + Zk,$$
we obtain by substitution the equations
$$\nabla.\omega = \frac{dX}{dx} + \frac{dY}{dy} + \frac{dZ}{dz}$$
and
$$\nabla\times\omega = i\left(\frac{dZ}{dy} - \frac{dY}{dz}\right) + j\left(\frac{dX}{dz} - \frac{dZ}{dx}\right) + k\left(\frac{dY}{dx} - \frac{dX}{dy}\right),$$
which may also be regarded as defining $\nabla.\omega$ and $\nabla\times\omega$.

55. Surface-integrals.—The integral $\iint\omega.d\sigma$, in which $d\sigma$ represents an element of some surface, is called the surface-integral of $\omega$ for that surface.
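The component formulae of Nos. 52 and 54 lend themselves to a small numerical illustration. The sketch below approximates the partial derivatives by central differences; the step `h`, the helper names, and the two sample functions `u` and `w` are assumptions introduced for the example only.

```python
def partial(f, p, axis, h=1e-5):
    """Central-difference estimate of df/dx, df/dy or df/dz (axis 0, 1, 2) at p.
    Works for scalar-valued f and for f returning a 3-tuple."""
    lo, hi = list(p), list(p)
    lo[axis] -= h
    hi[axis] += h
    f_lo, f_hi = f(tuple(lo)), f(tuple(hi))
    if isinstance(f_hi, tuple):
        return tuple((b - a) / (2 * h) for a, b in zip(f_lo, f_hi))
    return (f_hi - f_lo) / (2 * h)

def grad(u, p):
    # No. 52: components du/dx, du/dy, du/dz.
    return tuple(partial(u, p, axis) for axis in range(3))

def div(w, p):
    # No. 54: dX/dx + dY/dy + dZ/dz.
    return sum(partial(w, p, axis)[axis] for axis in range(3))

def curl(w, p):
    # No. 54: (dZ/dy - dY/dz, dX/dz - dZ/dx, dY/dx - dX/dy).
    dx, dy, dz = (partial(w, p, axis) for axis in range(3))
    return (dy[2] - dz[1], dz[0] - dx[2], dx[1] - dy[0])

def u(p):                      # sample scalar function u = x^2 y + z
    x, y, z = p
    return x * x * y + z

def w(p):                      # sample vector function (X, Y, Z) = (-y, x, z^2)
    x, y, z = p
    return (-y, x, z * z)

point = (1.0, 2.0, 3.0)
grad_u = grad(u, point)        # exact value (2xy, x^2, 1) = (4, 1, 1)
div_w = div(w, point)          # exact value 2z = 6
curl_w = curl(w, point)        # exact value (0, 0, 2)
```

Because the sample functions are polynomials of low degree, the central differences agree with the exact derivatives up to rounding error.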
It is understood here and elsewhere, when a vector is said to represent a plane surface (or an element of surface which may be regarded as plane), that the magnitude of the vector represents the area of the surface, and that the direction of the vector represents that of the normal drawn toward the positive side of the surface. When the surface is defined as the boundary of a certain space, the outside of the surface is regarded as positive.

The surface-integral of any given space (i.e., the surface-integral of the surface bounding that space) is evidently equal to the sum of the surface-integrals of all the parts into which the original space may be divided. For the integrals relating to the surfaces dividing the parts will evidently cancel in such a sum.

The surface-integral of $\omega$ for a closed surface bounding a space $dv$ infinitely small in all its dimensions is
$$\nabla.\omega\,dv.$$
This follows immediately from the definition of $\nabla.\omega$, when the space is a parallelopiped bounded by planes perpendicular to $i$, $j$, $k$. In other cases, we may imagine the space—or rather a space nearly coincident with the given space and of the same volume $dv$—to be divided up into such parallelopipeds. The surface-integral for the space made up of the parallelopipeds will be the sum of the surface-integrals of all the parallelopipeds, and will therefore be expressed by $\nabla.\omega\,dv$. The surface-integral of the original space will have sensibly the same value, and will therefore be represented by the same formula. It follows that the value of $\nabla.\omega$ does not depend upon the system of unit vectors employed in its definition.

It is possible to attribute such a physical signification to the quantities concerned in the above proposition, as shall make it evident almost without demonstration. Let us suppose $\omega$ to represent a flux of any substance.
The rate of decrease of the density of that substance at any point will be obtained by dividing the surface-integral of the flux for any infinitely small closed surface about the point by the volume enclosed. This quotient must therefore be independent of the form of the surface. We may define $\nabla.\omega$ as representing that quotient, and then obtain equation (1) of No. 54 by applying the general principle to the case of the rectangular parallelopiped.

56. Skew surface-integrals.—The integral $\iint d\sigma\times\omega$ may be called the skew surface-integral of $\omega$. It is evidently a vector. For a closed surface bounding a space $dv$ infinitely small in all dimensions this integral reduces to $\nabla\times\omega\,dv$, as is easily shown by reasoning like that of No. 55.

57. Integration.—If $dv$ represents an element of any space, and $d\sigma$ an element of the bounding surface,
$$\iiint\nabla.\omega\,dv = \iint\omega.d\sigma.$$
For the first member of this equation represents the sum of the surface-integrals of all the elements of the given space. We may regard this principle as affording a means of integration, since we may use it to reduce a triple integral (of a certain form) to a double integral.

The principle may also be expressed as follows: The surface-integral of any vector function of position in space for a closed surface is equal to the volume-integral of the divergence of that function for the space enclosed.

58. Line-integrals.—The integral $\int\omega.d\rho$, in which $d\rho$ denotes the element of a line, is called the line-integral of $\omega$ for that line. It is implied that one of the directions of the line is distinguished as positive. When the line is regarded as bounding a surface, that side of the surface will always be regarded as positive, on which the surface appears to be circumscribed counter-clockwise.

59. Integration.—From No. 51 we obtain directly
$$\int\nabla u.d\rho = u'' - u',$$
where the single and double accents distinguish the values relating to the beginning and end of the line.
In other words,—The line-integral of the derivative of any (continuous and single-valued) scalar function of position in space is equal to the difference of the values of the function at the extremities of the line. For a closed line the integral vanishes.

60. Integration.—The following principle may be used to reduce double integrals of a certain form to simple integrals.

If $d\sigma$ represents an element of any surface, and $d\rho$ an element of the bounding line,
$$\iint\nabla\times\omega.d\sigma = \int\omega.d\rho.$$
In other words,—The line-integral of any vector function of position in space for a closed line is equal to the surface-integral of the curl of that function for any surface bounded by the line.

To prove this principle, we will consider the variation of the line-integral which is due to a variation in the closed line for which the integral is taken. We have, in the first place,
$$\delta\int\omega.d\rho = \int\delta\omega.d\rho + \int\omega.\delta\,d\rho.$$
But
$$\omega.\delta\,d\rho = d(\omega.\delta\rho) - d\omega.\delta\rho.$$
Therefore, since $\int d(\omega.\delta\rho) = 0$ for a closed line,
$$\delta\int\omega.d\rho = \int\delta\omega.d\rho - \int d\omega.\delta\rho.$$
Now [...] for the infinitesimal surface generated by the motion of the whole line. Hence, if we conceive of a closed curve passing gradually from an infinitesimal loop to any finite form, the differential of the line-integral of $\omega$ for that curve will be equal to the differential of the surface-integral of $\nabla\times\omega$ for the surface generated: therefore, since both integrals commence with the value zero, they must always be equal to each other. Such a mode of generation will evidently apply to any surface closing any loop.

61. The line-integral of $\omega$ for a closed line bounding a plane surface $d\sigma$ infinitely small in all its dimensions is therefore
$$\nabla\times\omega.d\sigma.$$
This principle affords a definition of $\nabla\times\omega$ which is independent of any reference to coordinate axes.
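The statement of No. 61 can be tried numerically by summing $\omega.d\rho$ round a small circle and dividing by its area. What follows is a rough sketch under stated assumptions: the sample function `w`, the center, the radius, the number of chords `n`, and the function names are all chosen for illustration, and the circle is replaced by an inscribed polygon.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def line_integral_over_circle(w, center, e1, e2, radius, n=2000):
    """Approximate the line-integral of w round a circle lying in the plane of the
    perpendicular unit vectors e1, e2, traversed from e1 toward e2, so that the
    positive normal is e1 x e2. Each chord contributes w(midpoint).chord."""
    def point(t):
        return tuple(c + radius * (math.cos(t) * a + math.sin(t) * b)
                     for c, a, b in zip(center, e1, e2))
    total = 0.0
    for step in range(n):
        p0 = point(2 * math.pi * step / n)
        p1 = point(2 * math.pi * (step + 1) / n)
        mid = tuple((x + y) / 2 for x, y in zip(p0, p1))
        chord = tuple(y - x for x, y in zip(p0, p1))
        total += dot(w(mid), chord)
    return total

def w(p):                      # sample vector function whose curl is (0, 0, 2)
    x, y, z = p
    return (-y, x, z * z)

center = (1.0, 2.0, 3.0)
radius = 0.01
area = math.pi * radius ** 2

# Circle with positive normal k: the quotient should approach the k-component of the curl.
quotient_k = line_integral_over_circle(w, center, (1, 0, 0), (0, 1, 0), radius) / area
# Circle with positive normal i: the quotient should approach the i-component of the curl.
quotient_i = line_integral_over_circle(w, center, (0, 1, 0), (0, 0, 1), radius) / area
```

With these choices `quotient_k` comes out close to 2 and `quotient_i` close to 0, the components of the curl along the two normals.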
If we imagine a circle described about a fixed point to vary its orientation while keeping the same size, there will be a certain position of the circle for which the line-integral of $\omega$ will be a maximum, unless the line-integral vanishes for all positions of the circle. The axis of the circle in this position, drawn toward the side on which a positive motion in the circle appears counter-clockwise, gives the direction of $\nabla\times\omega$, and the quotient of the integral divided by the area of the circle gives the magnitude of $\nabla\times\omega$.

$\nabla$, $\nabla.$, and $\nabla\times$ applied to Functions of Functions of Position.

62. A constant scalar factor after $\nabla$, $\nabla.$, or $\nabla\times$ may be placed before the symbol.

63. If $f(u)$ denotes any scalar function of $u$, and $f'(u)$ the derived function,
$$\nabla f(u) = f'(u)\nabla u.$$

64. If $u$ or $\omega$ is a function of several scalar or vector variables which are themselves functions of the position of a single point, the value of $\nabla u$ or $\nabla.\omega$ or $\nabla\times\omega$ will be equal to the sum of the values obtained by making successively all but each one of these variables constant.

65. By the use of this principle we easily derive the following identical equations:
$$\nabla(t+u) = \nabla t + \nabla u. \tag{1}$$
$$\nabla.[\tau+\omega] = \nabla.\tau + \nabla.\omega. \qquad \nabla\times[\tau+\omega] = \nabla\times\tau + \nabla\times\omega. \tag{2}$$
$$\nabla(tu) = u\nabla t + t\nabla u. \tag{3}$$
$$\nabla.(u\omega) = \omega.\nabla u + u\nabla.\omega. \tag{4}$$
$$\nabla\times[u\omega] = u\nabla\times\omega - \omega\times\nabla u. \tag{5}$$
$$\nabla.[\tau\times\omega] = \omega.\nabla\times\tau - \tau.\nabla\times\omega. \tag{6}$$
The student will observe an analogy between these equations and the formulae of multiplication. (In the last four equations the analogy appears most distinctly when we regard all the factors but one as constant.) Some of the more curious features of this analogy are due to the fact that the $\nabla$ contains implicitly the vectors $i$, $j$ and $k$, which are to be multiplied into the following quantities.

Combinations of the Operators $\nabla$, $\nabla.$, and $\nabla\times$.

66. If $u$ is any scalar function of position in space,
$$\nabla\times\nabla u = 0,$$
as may be derived directly from the definitions of these operators.

67.
Conversely, if $\omega$ is such a vector function of position in space that
$$\nabla\times\omega = 0,$$
$\omega$ is the derivative of a scalar function of position in space. This will appear from the following considerations:

The line-integral $\int\omega.d\rho$ will vanish for any closed line, since it may be expressed as the surface-integral of $\nabla\times\omega$. (No. 60.) The line-integral taken from one given point $P'$ to another given point $P''$ is independent of the line between the points for which the integral is taken. (For, if two lines joining the same points gave different values, by reversing one we should obtain a closed line for which the integral would not vanish.) If we set $u$ equal to this line-integral, supposing $P''$ to be variable and $P'$ to be constant in position, $u$ will be a scalar function of the position of the point $P''$, satisfying the condition $du = \omega.d\rho$, or, by No. 51, $\nabla u = \omega$. There will evidently be an infinite number of functions satisfying this condition, which will differ from one another by constant quantities.

If the region for which $\nabla\times\omega = 0$ is unlimited, these functions will be single-valued. If the region is limited, but acyclic,* the functions will still be single-valued and satisfy the condition $\nabla u = \omega$ within the same region. If the region is cyclic, we may determine functions satisfying the condition $\nabla u = \omega$ within the region, but they will not necessarily be single-valued.

68. If $\omega$ is any vector function of position in space, $\nabla.\nabla\times\omega = 0$. This may be deduced directly from the definitions of No. 54.

The converse of this proposition will be proved hereafter.

69. If $u$ is any scalar function of position in space, we have by Nos. 52 and 54
$$\nabla.\nabla u = \left(\frac{d^2}{dx^2} + \frac{d^2}{dy^2} + \frac{d^2}{dz^2}\right)u.$$

70. Def.—If $\omega$ is any vector function of position in space, we may define $\nabla.\nabla\omega$ by the equation
$$\nabla.\nabla\omega = \left(\frac{d^2}{dx^2} + \frac{d^2}{dy^2} + \frac{d^2}{dz^2}\right)\omega$$
[...]

[...] $.d\sigma - \iint\omega\times\nabla u.d\sigma$, where, as elsewhere in these equations, the line-integral relates to the boundary of the surface-integral.
From this, by substitution of $\nabla t$ for $\omega$, we may derive as a particular case
$$\iint\nabla u\times\nabla t.d\sigma$$
[...]

[...] is untenable may be shown in a similar manner. Therefore the value of $u$ is constant.

This proposition may be generalized by substituting the condition $\nabla.[t\nabla u] = 0$ for $\nabla.\nabla u = 0$, $t$ denoting any positive (or any negative) scalar function of position in space. The conclusion would be the same, and the demonstration similar.

81. If throughout a certain space (which need not be continuous, and which may extend to infinity)
$$\nabla.\nabla u = 0,$$
and in all the bounding surfaces the normal component of $\nabla u$ vanishes, and at infinite distances within the space (if such there are)
$$r^2\frac{du}{dr} = 0,$$
where $r$ denotes the distance from a fixed origin, then throughout the space
$$\nabla u = 0,$$
and in each continuous portion of the same
$$u = \text{constant}.$$
For, if anywhere in the space in question $\nabla u$ has a value different from zero, let it have such a value at a point $P$, and let $u$ be there equal to $b$. Imagine a spherical surface about the above-mentioned origin as center, enclosing the point $P$, and with a radius $r$. Consider that portion of the space to which the theorem relates which is within the sphere and in which $u$ [...]

[...] $\nabla.\tau = \nabla.\omega$, and in all the bounding surfaces the tangential components of $\tau$ and $\omega$ are equal, then throughout the space
$$\tau = \omega.$$
It is evidently sufficient to prove this proposition for a continuous space. Setting $\nabla u = \tau - \omega$, we have $\nabla.\nabla u = 0$ for the whole space, and $u = \text{constant}$ for its boundary, which will be a single surface for a continuous aperiphractic space. Hence throughout the space $\nabla u = \tau - \omega = 0$.

* If a space encloses within itself another space, it is called periphractic, otherwise aperiphractic.

88. If throughout an acyclic space contained within finite boundaries but not necessarily continuous
$$\nabla\times\tau = \nabla\times\omega \quad\text{and}\quad \nabla.\tau = \nabla.\omega,$$
and in all the bounding surfaces the normal components of $\tau$ and $\omega$ are equal, then throughout the whole space
$$\tau = \omega.$$
Setting $\nabla u = \tau - \omega$, we have $\nabla.\nabla u = 0$ throughout the space, and the normal component of $\nabla u$ at the boundary equal to zero. Hence throughout the whole space $\nabla u = \tau - \omega = 0$.

89. If throughout a certain space (which need not be continuous, and which may extend to infinity)
$$\nabla.\nabla\tau = \nabla.\nabla\omega,$$
and in all the bounding surfaces
$$\tau = \omega,$$
and at infinite distances within the space (if such there are)
$$\tau = \omega,$$
then throughout the whole space
$$\tau = \omega.$$
This will be apparent if we consider separately each of the scalar components of $\tau$ and $\omega$.

Minimum Values of the Volume-integral $\iiint u\,\omega.\omega\,dv$. (Thomson's Theorems.)

90. Let it be required to determine for a certain space a vector function of position $\omega$ subject to certain conditions (to be specified hereafter), so that the volume-integral
$$\iiint u\,\omega.\omega\,dv$$
for that space shall have a minimum value, $u$ denoting a given positive scalar function of position.

a. In the first place, let the vector $\omega$ be subject to the conditions that $\nabla.\omega$ is given within the space, and that the normal component of $\omega$ is given for the bounding surface. (This component must of course be such that the surface-integral of $\omega$ shall be equal to the volume-integral $\iiint\nabla.\omega\,dv$. If the space is not continuous, this must be true of each continuous portion of it. See No. 57.) The solution is that $\nabla\times(u\omega) = 0$, or more generally, that the line-integral of $u\omega$ for any closed curve in the space shall vanish.

The existence of the minimum requires that
$$\iiint u\,\omega.\delta\omega\,dv = 0,$$
while $\delta\omega$ is subject to the limitation that
$$\nabla.\delta\omega = 0,$$
and that the normal component of $\delta\omega$ at the bounding surface vanishes. To prove that the line-integral of $u\omega$ vanishes for any closed curve within the space, let us imagine the curve to be surrounded by an infinitely slender tube of normal section $dz$, which may be either constant or variable. We may satisfy the equation $\nabla.\delta\omega = 0$ by making $\delta\omega = 0$ outside of the tube, and $\delta\omega\,dz = \delta a\,\frac{d\rho}{ds}$ within it, $\delta a$ denoting an arbitrary infinitesimal constant, $\rho$ the position-vector, and $ds$ an element of the length of the tube or closed curve. We have then
$$\iiint u\,\omega.\delta\omega\,dv = \int u\,\omega.\delta\omega\,dz\,ds = \int u\,\omega.d\rho\,\delta a = \delta a\int u\,\omega.d\rho = 0,$$
whence
$$\int u\,\omega.d\rho = 0. \qquad \text{Q.E.D.}$$

We may express this result by saying that $u\omega$ is the derivative of a single-valued scalar function of position in space. (See No. 67.)

If for certain parts of the surface the normal component of $\omega$ is not given for each point, but only the surface-integral of $\omega$ for each such part, then the above reasoning will apply not only to closed curves, but also to curves commencing and ending in such a part of the surface. The primitive of $u\omega$ will then have a constant value in each such part.

If the space extends to infinity and there is no special condition respecting the value of $\omega$ at infinite distances, the primitive of $u\omega$ will have a constant value at infinite distances within the space or within each separate continuous part of it.

If we except those cases in which the problem has no definite meaning because the data are such that the integral $\iiint u\,\omega.\omega\,dv$ must be infinite, it is evident that a minimum must always exist, and (on account of the quadratic form of the integral) that it is unique. That the conditions just found are sufficient to insure this minimum, is evident from the consideration that any allowable values of $\delta\omega$ may be made up of such values as we have supposed. Therefore, there will be one and only one vector function of position in space which satisfies these conditions together with those enumerated at the beginning of this number.

b. In the second place, let the vector $\omega$ be subject to the conditions that $\nabla\times\omega$ is given throughout the space, and that the tangential component of $\omega$ is given at the bounding surface. The solution is that
$$\nabla.[u\omega] = 0,$$
and, if the space is periphractic, that the surface-integral of $u\omega$ vanishes for each of the bounding surfaces.
The existence of the minimum requires that
$$\iiint u\,\omega.\delta\omega\,dv = 0,$$
while $\delta\omega$ is subject to the conditions that
$$\nabla\times\delta\omega = 0,$$
and that the tangential component of $\delta\omega$ in the bounding surface vanishes. In virtue of these conditions we may set
$$\delta\omega = \nabla\delta q,$$
where $\delta q$ is an arbitrary infinitesimal scalar function of position, subject only to the condition that it is constant in each of the bounding surfaces. (See No. 67.) By substitution of this value we obtain
$$\iiint u\,\omega.\nabla\delta q\,dv = 0,$$
or integrating by parts (No. 76)
$$\iint u\,\omega.d\sigma\,\delta q - \iiint\nabla.[u\omega]\,\delta q\,dv = 0.$$
Since $\delta q$ is arbitrary in the volume-integral, we have throughout the whole space
$$\nabla.[u\omega] = 0;$$
and since $\delta q$ has an arbitrary constant value in each of the bounding surfaces (if the boundary of the space consists of separate parts), we have for each such part
$$\iint u\,\omega.d\sigma = 0.$$

Potentials, Newtonians, Laplacians.

91. Def.—If $u'$ is the scalar quantity of something situated at a certain point $\rho'$, the potential of $u'$ for any point $\rho$ is a scalar function of $\rho$, defined by the equation
$$\text{pot}\,u' = \frac{u'}{[\rho'-\rho]_0},$$
and the Newtonian of $u'$ for any point $\rho$ is a vector function of $\rho$ defined by the equation
$$\text{new}\,u' = \frac{\rho'-\rho}{[\rho'-\rho]_0^3}\,u'.$$
Again, if $\omega'$ is the vector representing the quantity and direction of something situated at the point $\rho'$, the potential and the Laplacian of $\omega'$ for any point $\rho$ are vector functions of $\rho$ defined by the equations
$$\text{pot}\,\omega' = \frac{\omega'}{[\rho'-\rho]_0}, \qquad \text{lap}\,\omega' = \frac{\rho'-\rho}{[\rho'-\rho]_0^3}\times\omega'.$$

92. If $u$ or $\omega$ is a scalar or vector function of position in space, we may write Pot $u$, New $u$, Pot $\omega$, Lap $\omega$ for the volume-integrals of pot $u'$, etc., taken as functions of $\rho'$; i.e., we may set
$$\text{Pot}\,u = \iiint\text{pot}\,u'\,dv' = \iiint\frac{u'}{[\rho'-\rho]_0}\,dv',$$
$$\text{New}\,u = \iiint\text{new}\,u'\,dv' = \iiint\frac{\rho'-\rho}{[\rho'-\rho]_0^3}\,u'\,dv',$$
$$\text{Pot}\,\omega = \iiint\text{pot}\,\omega'\,dv' = \iiint\frac{\omega'}{[\rho'-\rho]_0}\,dv',$$
$$\text{Lap}\,\omega = \iiint\text{lap}\,\omega'\,dv' = \iiint\frac{\rho'-\rho}{[\rho'-\rho]_0^3}\times\omega'\,dv',$$
where the $\rho$ is to be regarded as constant in the integration. This extends over all space, or wherever the $u'$ or $\omega'$ have any values other than zero. These integrals may themselves be called (integral) potentials, Newtonians, and Laplacians.
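For a single quantity $u'$ situated at one point, the definitions of No. 91 reduce to two short formulae, which can be written out and compared numerically. This is a sketch under assumptions: the function names `pot`, `new`, and `grad`, the step `h`, and the sample values are hypothetical, and the check that the derivative of the potential (taken with respect to $\rho$) agrees with the Newtonian is a numerical illustration only.

```python
import math

def pot(u_src, p_src, p):
    """Potential at p of the scalar quantity u_src situated at p_src: u' / [rho' - rho]_0."""
    return u_src / math.dist(p_src, p)

def new(u_src, p_src, p):
    """Newtonian at p of u_src situated at p_src: (rho' - rho) u' / [rho' - rho]_0^3."""
    r = math.dist(p_src, p)
    return tuple(u_src * (a - b) / r ** 3 for a, b in zip(p_src, p))

def grad(f, p, h=1e-6):
    """Central-difference derivative of the scalar function f with respect to p."""
    out = []
    for axis in range(3):
        lo, hi = list(p), list(p)
        lo[axis] -= h
        hi[axis] += h
        out.append((f(tuple(hi)) - f(tuple(lo))) / (2 * h))
    return tuple(out)

u_src = 2.0                    # hypothetical quantity situated at the origin
p_src = (0.0, 0.0, 0.0)
p = (0.0, 3.0, 4.0)            # field point at distance 5 from the source

potential = pot(u_src, p_src, p)                        # 2 / 5
newtonian = new(u_src, p_src, p)                        # 2 * (0, -3, -4) / 125
grad_of_pot = grad(lambda q: pot(u_src, p_src, q), p)   # agrees with the Newtonian
```

The integral quantities Pot $u$ and New $u$ of No. 92 would be approximated in the same way by summing such terms over many elements $u'\,dv'$.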
93. $$\frac{d\,\text{Pot}\,u}{dx} = \text{Pot}\,\frac{du}{dx}, \qquad \frac{d\,\text{Pot}\,\omega}{dx} = \text{Pot}\,\frac{d\omega}{dx}.$$
This will be evident with respect both to scalar and to vector functions, if we suppose that when we differentiate the potential with respect to $x$ (thus varying the position of the point for which the potential is taken) each element of volume $dv'$ in the implied integral remains fixed, not in absolute position, but in position relative to the point for which the potential is taken. This supposition is evidently allowable whenever the integration indicated by the symbol Pot tends to a definite limit when the limits of integration are indefinitely extended.

Since we may substitute $y$ and $z$ for $x$ in the preceding formula, and since a constant factor of any kind may be introduced under the sign of integration, we have
$$\nabla\,\text{Pot}\,u = \text{Pot}\,\nabla u,$$
$$\nabla.\text{Pot}\,\omega = \text{Pot}\,\nabla.\omega,$$
$$\nabla\times\text{Pot}\,\omega = \text{Pot}\,\nabla\times\omega,$$
$$\nabla.\nabla\,\text{Pot}\,u = \text{Pot}\,\nabla.\nabla u,$$
$$\nabla.\nabla\,\text{Pot}\,\omega = \text{Pot}\,\nabla.\nabla\omega,$$
i.e., the symbols $\nabla$, $\nabla.$, $\nabla\times$, $\nabla.\nabla$ may be applied indifferently before or after the sign Pot.

Yet a certain restriction is to be observed. When the operation of taking the (integral) potential does not give a definite finite value, the first members of these equations are to be regarded as entirely indeterminate, but the second members may have perfectly definite values. This would be the case, for example, if $u$ or $\omega$ had a constant value throughout all space. It might seem harmless to set an indefinite expression equal to a definite, but it would be dangerous, since we might with equal right set the indefinite expression equal to other definite expressions, and then be misled into supposing these definite expressions to be equal to one another. It will be safe to say that the above equations will hold, provided that the potential of $u$ or $\omega$ has a definite value.
It will be observed that whenever Pot $u$ or Pot $\omega$ has a definite value in general (i.e., with the possible exception of certain points, lines, and surfaces),* the first members of all these equations will have definite values in general, and therefore the second members of the equation, being necessarily equal to the first members, when these have definite values, will also have definite values in general.

* Whenever it is said that a function of position in space has a definite value in general, this phrase is to be understood as explained above. The term definite is intended to exclude both indeterminate and infinite values.

94. Again, whenever Pot $u$ has a definite value we may write
$$\nabla\,\text{Pot}\,u = \nabla\iiint\frac{u'}{r}\,dv' = \iiint\nabla\frac{1}{r}\,u'\,dv',$$
where $r$ stands for $[\rho'-\rho]_0$. But
$$\nabla\frac{1}{r} = \frac{\rho'-\rho}{r^3},$$
whence
$$\nabla\,\text{Pot}\,u = \text{New}\,u.$$
Moreover, New $u$ will in general have a definite value, if Pot $u$ has.

95. In like manner, whenever Pot $\omega$ has a definite value,
$$\nabla\times\text{Pot}\,\omega = \nabla\times\iiint\frac{\omega'}{r}\,dv' = \iiint\nabla\times\frac{\omega'}{r}\,dv' = \iiint\nabla\frac{1}{r}\times\omega'\,dv'.$$
Substituting the value of $\nabla\frac{1}{r}$ given above we have
$$\nabla\times\text{Pot}\,\omega = \text{Lap}\,\omega.$$
Lap $\omega$ will have a definite value in general whenever Pot $\omega$ has.

96. Hence, with the aid of No. 93, we obtain
$$\nabla\times\text{Lap}\,\omega = \text{Lap}\,\nabla\times\omega, \qquad \nabla.\text{Lap}\,\omega = 0,$$
whenever Pot $\omega$ has a definite value.

97. By the method of No. 93 we obtain
$$\nabla.\text{New}\,u = \nabla.\iiint\frac{\rho'-\rho}{r^3}\,u'\,dv' = \iiint\nabla u'.\frac{\rho'-\rho}{r^3}\,dv'.$$
To find the value of this integral, we may regard the point $\rho$, which is constant in the integration, as the center of polar coordinates. Then $r$ becomes the radius vector of the point $\rho'$, and we may set
$$dv' = r^2\,dq\,dr,$$
where $r^2\,dq$ is the element of a spherical surface having center at $\rho$ and radius $r$. We may also set
$$\frac{\rho'-\rho}{r}.\nabla u' = \frac{du'}{dr}.$$
We thus obtain
$$\nabla.\text{New}\,u = \iiint\frac{du'}{dr}\,dq\,dr = 4\pi\int\frac{d\bar{u}'}{dr}\,dr = 4\pi\,\bar{u}'_{r=\infty} - 4\pi\,\bar{u}'_{r=0},$$
where $\bar{u}'$ denotes the average value of $u'$ in a spherical surface of radius $r$ about the point $\rho$ as center.

Now if Pot $u$ has in general a definite value, we must have $\bar{u}' = 0$ for $r = \infty$. Also, $\nabla.\text{New}\,u$ will have in general a definite value.
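For $r = 0$, the value of $\bar{u}'$ is evidently $u$. We have, therefore,
$$\nabla.\text{New}\,u = -4\pi u, \qquad \nabla.\nabla\,\text{Pot}\,u = -4\pi u.*$$

* Better thus:
$$\nabla.\nabla\,\text{Pot}\,u = \iiint\frac{1}{r}\nabla.\nabla u\,dv = \iiint\nabla.\left[\frac{1}{r}\nabla u\right]dv - \iiint\nabla.\left[u\nabla\frac{1}{r}\right]dv + \iiint u\,\nabla.\nabla\frac{1}{r}\,dv = -\iint u\,\nabla\frac{1}{r}.d\sigma = -4\pi u.$$
[MS. note by author.]

98. If Pot $\omega$ has in general a definite value,
$$\nabla.\nabla\,\text{Pot}\,\omega = \nabla.\nabla\,\text{Pot}\,[ui+vj+wk] = \nabla.\nabla\,\text{Pot}\,u\;i + \nabla.\nabla\,\text{Pot}\,v\;j + \nabla.\nabla\,\text{Pot}\,w\;k$$
$$= -4\pi ui - 4\pi vj - 4\pi wk = -4\pi\omega.$$
Hence, by No. 71,
$$\nabla\times\nabla\times\text{Pot}\,\omega - \nabla\nabla.\text{Pot}\,\omega = 4\pi\omega.$$
That is,
$$\text{Lap}\,\nabla\times\omega - \text{New}\,\nabla.\omega = 4\pi\omega.$$
If we set
$$\omega_1 = \frac{1}{4\pi}\text{Lap}\,\nabla\times\omega, \qquad \omega_2 = -\frac{1}{4\pi}\text{New}\,\nabla.\omega,$$
we have
$$\omega = \omega_1 + \omega_2,$$
where $\omega_1$ and $\omega_2$ are such functions of position that $\nabla.\omega_1 = 0$, and $\nabla\times\omega_2 = 0$. This is expressed by saying that $\omega_1$ is solenoidal, and $\omega_2$ irrotational. Pot $\omega_1$ and Pot $\omega_2$, like Pot $\omega$, will have in general definite values.

It is worth while to notice that there is only one way in which a vector function of position in space having a definite potential can be thus divided into solenoidal and irrotational parts having definite potentials. For if $\omega_1+\epsilon$, $\omega_2-\epsilon$ are two other such parts,
$$\nabla.\epsilon = 0 \quad\text{and}\quad \nabla\times\epsilon = 0.$$
Moreover, Pot $\epsilon$ has in general a definite value, and therefore
$$\epsilon = \frac{1}{4\pi}\text{Lap}\,\nabla\times\epsilon - \frac{1}{4\pi}\text{New}\,\nabla.\epsilon = 0. \qquad \text{Q.E.D.}$$

99. To assist the memory of the student, some of the principal results of Nos. 93-98 may be expressed as follows:

Let $\omega_1$ be any solenoidal vector function of position in space, [...]

[...] or the scalar function $u$, $\frac{1}{4\pi}$New and $-\nabla.$ are inverse operators; i.e.,
$$-\frac{1}{4\pi}\text{New}\,\nabla.\omega_2 = \omega_2, \qquad -\nabla.\frac{1}{4\pi}\text{New}\,u = u.$$
Applied to the solenoidal function $\omega_1$, the operator $\nabla.$ gives zero; i.e. $\nabla.\omega_1 = 0$.

Since the most general form of a vector function having in general a definite potential may be written $\omega_1+\omega_2$, [...]

[...]
$$-\nabla\frac{1}{4\pi}\text{Pot}\,\nabla.\omega_2 = -\nabla\nabla.\frac{1}{4\pi}\text{Pot}\,\omega_2 = \omega_2.$$

With respect to any scalar or vector function having in general a definite potential $\frac{1}{4\pi}$Pot and $-\nabla.\nabla$ are inverse operators; i.e.,
$$-\frac{1}{4\pi}\text{Pot}\,\nabla.\nabla u = -\nabla.\frac{1}{4\pi}\text{Pot}\,\nabla u = -\nabla.\nabla\frac{1}{4\pi}\text{Pot}\,u = u,$$
$$-\frac{1}{4\pi}\text{Pot}\,\nabla.\nabla[\omega_1+\omega_2] = -\nabla.\nabla\frac{1}{4\pi}\text{Pot}\,[\omega_1+\omega_2] = \omega_1+\omega_2.$$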
With respect to the solenoidal function $\omega_1$, $-\nabla\cdot\nabla$ and $\nabla\times\nabla\times$ are equivalent; with respect to the irrotational function $\omega_2$, $\nabla\cdot\nabla$ and $\nabla\nabla\cdot$ are equivalent; i.e.,
$$-\nabla\cdot\nabla\omega_1 = \nabla\times\nabla\times\omega_1, \qquad \nabla\cdot\nabla\omega_2 = \nabla\nabla\cdot\omega_2.$$

100. On the interpretation of the preceding formulae.—Infinite values of the quantity which occurs in a volume-integral as the coefficient of the element of volume will not necessarily make the value of the integral infinite, when they are confined to certain surfaces, lines, or points. Yet these surfaces, lines, or points may contribute a certain finite amount to the value of the volume-integral, which must be separately calculated, and in the case of surfaces or lines is naturally expressed as a surface- or line-integral. Such cases are easily treated by substituting for the surface, line, or point, a very thin shell, or filament, or a solid very small in all dimensions, within which the function may be supposed to have a very large value.

The only cases which we shall here consider in detail are those of surfaces at which the functions of position ($u$ or $\omega$) are discontinuous, and the values of $\nabla u$, $\nabla\times\omega$, $\nabla\cdot\omega$ thus become infinite. Let the function $u$ have the value $u_1$ on the side of the surface which we regard as the negative, and the value $u_2$ on the positive side. Let $\Delta u = u_2 - u_1$. If we substitute for the surface a shell of very small thickness $a$, within which the value of $u$ varies uniformly as we pass through the shell, we shall have $\nabla u = \nu\,\frac{\Delta u}{a}$ within the shell, $\nu$ denoting a unit normal on the positive side of the surface. The elements of volume which compose the shell may be expressed by $a\,[d\sigma]_0$, where $[d\sigma]_0$ is the magnitude of an element of the surface, $d\sigma$ being the vector element. Hence,
$$\nabla u\,dv = \nu\,\Delta u\,[d\sigma]_0 = \Delta u\,d\sigma.$$
Hence, when there are surfaces at which the values of $u$ are discontinuous, the full value of $\mathrm{Pot}\,\nabla u$ should always be understood as including the surface-integral
$$\iint \frac{\Delta u'}{[\rho'-\rho]_0}\,d\sigma'$$
relating to such surfaces.
($\Delta u'$ and $d\sigma'$ are accented in the formula to indicate that they relate to the point $\rho'$.)

In the case of a vector function which is discontinuous at a surface, the expressions $\nabla\cdot\omega\,dv$ and $\nabla\times\omega\,dv$, relating to the element of the shell which we substitute for the surface of discontinuity, are easily transformed by the principle that these expressions are the direct and skew surface-integrals of $\omega$ for the element of the shell. (See Nos. 55, 56.) The part of the surface-integrals relating to the edge of the element may evidently be neglected, and we shall have
$$\nabla\cdot\omega\,dv = \omega_2\cdot d\sigma - \omega_1\cdot d\sigma = \Delta\omega\cdot d\sigma,$$
$$\nabla\times\omega\,dv = d\sigma\times\omega_2 - d\sigma\times\omega_1 = d\sigma\times\Delta\omega.$$
Whenever, therefore, $\omega$ is discontinuous at surfaces, the expressions $\mathrm{Pot}\,\nabla\cdot\omega$ and $\mathrm{New}\,\nabla\cdot\omega$ must be regarded as implicitly including the surface-integrals
$$\iint \frac{1}{[\rho'-\rho]_0}\,\Delta\omega'\cdot d\sigma' \quad\text{and}\quad \iint \frac{\rho'-\rho}{[\rho'-\rho]_0^{\,3}}\,\Delta\omega'\cdot d\sigma'$$
respectively, relating to such surfaces, and the expressions $\mathrm{Pot}\,\nabla\times\omega$ and $\mathrm{Lap}\,\nabla\times\omega$ as including the surface-integrals
$$\iint \frac{1}{[\rho'-\rho]_0}\,d\sigma'\times\Delta\omega' \quad\text{and}\quad \iint \frac{\rho'-\rho}{[\rho'-\rho]_0^{\,3}}\times[d\sigma'\times\Delta\omega']$$
respectively, relating to such surfaces.

101. We have already seen that if $\omega$ is the curl of any vector function of position, $\nabla\cdot\omega = 0$. (No. 68.) The converse is evidently true, whenever the equation $\nabla\cdot\omega = 0$ holds throughout all space, and $\omega$ has in general a definite potential; for then
$$\omega = \nabla\times\frac{1}{4\pi}\,\mathrm{Lap}\,\omega.$$
Again, if $\nabla\cdot\omega = 0$ within any aperiphractic space $A$, contained within finite boundaries, we may suppose that space to be enclosed by a shell $B$ having its inner surface coincident with the surface of $A$. We may imagine a function of position $\omega'$, such that $\omega' = \omega$ in $A$, $\omega' = 0$ outside of the shell $B$, and the integral $\iiint \omega'\cdot\omega'\,dv$ for $B$ has the least value consistent with the conditions that the normal component of $\omega'$ at the outer surface is zero, and at the inner surface is equal to that of $\omega$, and that in the shell $\nabla\cdot\omega' = 0$ (compare No. 90). Then $\nabla\cdot\omega' = 0$ throughout all space, and the potential of $\omega'$ will have in general a definite value.
Hence,
$$\omega' = \nabla\times\frac{1}{4\pi}\,\mathrm{Lap}\,\omega',$$
and $\omega$ will have the same value within the space $A$.

102.† Def.—If $\omega$ is a vector function of position in space, the Maxwellian* of $\omega$ is a scalar function of position defined by the equation
$$\mathrm{Max}\,\omega = \iiint \frac{\rho'-\rho}{[\rho'-\rho]_0^{\,3}}\cdot\omega'\,dv'.$$
(Compare No. 92.) From this definition the following properties are easily derived. It is supposed that the functions $\omega$ and $u$ are such that their potentials have in general definite values.
$$\mathrm{Max}\,\omega = \nabla\cdot\mathrm{Pot}\,\omega = \mathrm{Pot}\,\nabla\cdot\omega,$$
$$\nabla\,\mathrm{Max}\,\omega = \nabla\nabla\cdot\mathrm{Pot}\,\omega = \mathrm{New}\,\nabla\cdot\omega,$$
$$\mathrm{Max}\,\nabla u = -4\pi u,$$
$$4\pi\omega = \nabla\times\mathrm{Lap}\,\omega - \nabla\,\mathrm{Max}\,\omega.$$

* The frequent occurrence of the integral in Maxwell's Treatise on Electricity and Magnetism has suggested this name.

† [The foregoing portion of this paper was printed in 1881, the rest in 1884.]

If the values of $\mathrm{Lap}\,\mathrm{Lap}\,\omega$, $\mathrm{New}\,\mathrm{Max}\,\omega$, and $\mathrm{Max}\,\mathrm{New}\,u$ are in general definite, we may add
$$4\pi\,\mathrm{Pot}\,\omega = \mathrm{Lap}\,\mathrm{Lap}\,\omega - \mathrm{New}\,\mathrm{Max}\,\omega,$$
$$4\pi\,\mathrm{Pot}\,u = -\mathrm{Max}\,\mathrm{New}\,u.$$
In other words: The Maxwellian is the divergence of the potential, $-\frac{1}{4\pi}\mathrm{Max}$ and $\nabla$ are inverse operators for scalars and irrotational vectors, for vectors in general $-\frac{1}{4\pi}\nabla\,\mathrm{Max}$ is an operator which separates the irrotational from the solenoidal part. For scalars and irrotational vectors, $-\frac{1}{4\pi}\mathrm{Max}\,\mathrm{New}$ and $-\frac{1}{4\pi}\mathrm{New}\,\mathrm{Max}$ give the potential, for solenoidal vectors $\frac{1}{4\pi}\mathrm{Lap}\,\mathrm{Lap}$ gives the potential, for vectors in general $-\frac{1}{4\pi}\mathrm{New}\,\mathrm{Max}$ gives the potential of the irrotational part, and $\frac{1}{4\pi}\mathrm{Lap}\,\mathrm{Lap}$ the potential of the solenoidal part.

103. Def.—The following double volume-integrals are of frequent occurrence in physical problems. They are all scalar quantities, and none of them functions of position in space, as are the single volume-integrals which we have been considering. The integrations extend over all space, or as far as the expression to be integrated has values other than zero.

The mutual potential, or potential product, of two scalar functions of position in space is defined by the equation
$$\mathrm{Pot}\,(u, w) = \iiint\!\!\iiint \frac{u\,w'}{r}\,dv\,dv' = \iiint u\,\mathrm{Pot}\,w\,dv = \iiint w\,\mathrm{Pot}\,u\,dv.$$
In the double volume-integral, $r$ is the distance between the two elements of volume, and $u$ relates to $dv$ as $w'$ to $dv'$.

The mutual potential, or potential product, of two vector functions of position in space is defined by the equation
$$\mathrm{Pot}\,(\phi, \omega) = \iiint\!\!\iiint \frac{\phi\cdot\omega'}{r}\,dv\,dv' = \iiint \phi\cdot\mathrm{Pot}\,\omega\,dv = \iiint \omega\cdot\mathrm{Pot}\,\phi\,dv.$$
The mutual Laplacian, or Laplacian product, of two vector functions of position in space is defined by the equation
$$\mathrm{Lap}\,(\phi, \omega) = \iiint\!\!\iiint \omega\cdot\frac{\rho'-\rho}{r^3}\times\phi'\,dv\,dv' = \iiint \omega\cdot\mathrm{Lap}\,\phi\,dv = \iiint \phi\cdot\mathrm{Lap}\,\omega\,dv.$$
The Newtonian product of a scalar and a vector function of position in space is defined by the equation
$$\mathrm{New}\,(u, \omega) = \iiint\!\!\iiint \omega\cdot\frac{\rho'-\rho}{r^3}\,u'\,dv\,dv' = \iiint \omega\cdot\mathrm{New}\,u\,dv.$$
The Maxwellian product of a vector and a scalar function of position in space is defined by the equation
$$\mathrm{Max}\,(\omega, u) = \iiint\!\!\iiint u\,\frac{\rho'-\rho}{r^3}\cdot\omega'\,dv\,dv' = \iiint u\,\mathrm{Max}\,\omega\,dv = -\mathrm{New}\,(u, \omega).$$
It is of course supposed that $u$, $w$, $\phi$, $\omega$ are such functions of position that the above expressions have definite values.

104. By No. 97,
$$4\pi u\,\mathrm{Pot}\,w = -\nabla\cdot\mathrm{New}\,u\;\mathrm{Pot}\,w = -\nabla\cdot[\mathrm{New}\,u\;\mathrm{Pot}\,w] + \mathrm{New}\,u\cdot\mathrm{New}\,w.$$
The volume-integral of this equation gives
$$4\pi\,\mathrm{Pot}\,(u, w) = \iiint \mathrm{New}\,u\cdot\mathrm{New}\,w\,dv,$$
if the integral
$$\iint d\sigma\cdot\mathrm{New}\,u\;\mathrm{Pot}\,w,$$
for a closed surface, vanishes when the space included by the surface is indefinitely extended in all directions. This will be the case when everywhere outside of certain assignable limits the values of $u$ and $w$ are zero.

Again, by No. 102,
$$4\pi\omega\cdot\mathrm{Pot}\,\phi = \nabla\times\mathrm{Lap}\,\omega\cdot\mathrm{Pot}\,\phi - \nabla\,\mathrm{Max}\,\omega\cdot\mathrm{Pot}\,\phi = \nabla\cdot[\mathrm{Lap}\,\omega\times\mathrm{Pot}\,\phi] + \mathrm{Lap}\,\omega\cdot\mathrm{Lap}\,\phi - \nabla\cdot[\mathrm{Max}\,\omega\;\mathrm{Pot}\,\phi] + \mathrm{Max}\,\omega\;\mathrm{Max}\,\phi.$$
The volume-integral of this equation gives
$$4\pi\,\mathrm{Pot}\,(\phi, \omega) = \iiint \mathrm{Lap}\,\phi\cdot\mathrm{Lap}\,\omega\,dv + \iiint \mathrm{Max}\,\phi\;\mathrm{Max}\,\omega\,dv,$$
if the integrals
$$\iint d\sigma\cdot\mathrm{Lap}\,\omega\times\mathrm{Pot}\,\phi, \qquad \iint d\sigma\cdot\mathrm{Pot}\,\phi\;\mathrm{Max}\,\omega,$$
for a closed surface vanish when the space included by the surface is indefinitely extended in all directions. This will be the case if everywhere outside of certain assignable limits the values of $\phi$ and $\omega$ are zero.

CHAPTER III.

CONCERNING LINEAR VECTOR FUNCTIONS.

105. Def.—A vector function of a vector is said to be linear, when the function of the sum of any two vectors is equal to the sum of the functions of the vectors. That is, if
$$\mathrm{func.}[\rho + \rho'] = \mathrm{func.}[\rho] + \mathrm{func.}[\rho']$$
for all values of $\rho$ and $\rho'$, the function is linear. In such cases it is easily shown that
$$\mathrm{func.}[a\rho + b\rho' + c\rho'' + \text{etc.}] = a\,\mathrm{func.}[\rho] + b\,\mathrm{func.}[\rho'] + c\,\mathrm{func.}[\rho''] + \text{etc.}$$

106. An expression of the form
$$\alpha\,\lambda\cdot\rho + \beta\,\mu\cdot\rho + \text{etc.}$$
evidently represents a linear function of $\rho$, and may be conveniently written in the form
$$\{\alpha\lambda + \beta\mu + \text{etc.}\}\cdot\rho.$$
The expression
$$\rho\cdot\alpha\,\lambda + \rho\cdot\beta\,\mu + \text{etc.}, \quad\text{or}\quad \rho\cdot\{\alpha\lambda + \beta\mu + \text{etc.}\},$$
also represents a linear function of $\rho$, which is, in general, different from the preceding, and will be called its conjugate.

107. Def.—An expression of the form $\alpha\lambda$ or $\beta\mu$ will be called a dyad. An expression consisting of any number of dyads united by the signs $+$ or $-$ will be called a dyadic binomial, trinomial, etc., as the case may be, or more briefly, a dyadic. The latter term will be used so as to include the case of a single dyad. When we desire to express a dyadic by a single letter, the Greek capitals will be used, except such as are like the Roman, and also $\Delta$ and $\Sigma$. The letter $I$ will also be used to represent a certain dyadic, to be mentioned hereafter.

Since any linear vector function may be expressed by means of a dyadic (as we shall see more particularly hereafter, see No. 110), the study of such functions, which is evidently of primary importance in the theory of vectors, may be reduced to that of dyadics.

108. Def.—Any two dyadics $\Phi$ and $\Psi$ are equal,
when $\Phi\cdot\rho = \Psi\cdot\rho$ for all values of $\rho$,
or, when $\rho\cdot\Phi = \rho\cdot\Psi$ for all values of $\rho$,
or, when $\sigma\cdot\Phi\cdot\rho = \sigma\cdot\Psi\cdot\rho$ for all values of $\sigma$ and of $\rho$.
The third condition is easily shown to be equivalent both to the first and to the second. The three conditions are therefore equivalent.

It follows that [...] and $\rho$. In the combination $\Phi\cdot\rho$, we shall say that $\Phi$ is used as a prefactor, in the combination $\rho\cdot\Phi$, as a postfactor.

110. If $\tau$ is any linear function of $\rho$, and for $\rho = i$, $\rho = j$, $\rho = k$ the values of $\tau$ are respectively $\alpha$, $\beta$, and $\gamma$, we may set
$$\tau = \{\alpha i + \beta j + \gamma k\}\cdot\rho,$$
and also
$$\tau = \rho\cdot\{i\alpha + j\beta + k\gamma\}.$$
Therefore, any linear function may be expressed by a dyadic as prefactor and also by a dyadic as postfactor.

111. Def.—We shall say that a dyadic is multiplied by a scalar, when one of the vectors of each of its component dyads is multiplied by that scalar. It is evidently immaterial to which vector of any dyad the scalar factor is applied. The product of the dyadic $\Phi$ and the scalar $a$ may be written either $a\Phi$ or $\Phi a$. The minus sign before a dyadic reverses the signs of all its terms.

112. The sign $+$ in a dyadic, or connecting dyadics, may be regarded as expressing addition, since the combination of dyads and dyadics with this sign is subject to the laws of association and commutation.

113. The combination of vectors in a dyad is evidently distributive. That is,
$$[\alpha + \beta + \text{etc.}]\,[\lambda + \mu + \text{etc.}] = \alpha\lambda + \alpha\mu + \beta\lambda + \beta\mu + \text{etc.}$$
We may therefore regard the dyad as a kind of product of the two vectors of which it is formed. Since this kind of product is not commutative, we shall have occasion to distinguish the factors as antecedent and consequent.

114. Since any vector may be expressed as a sum of $i$, $j$, and $k$ with scalar coefficients, every dyadic may be reduced to a sum of the nine dyads
$$ii,\ ij,\ ik,\ ji,\ jj,\ jk,\ ki,\ kj,\ kk,$$
with scalar coefficients. Two such sums cannot be equal according to the definitions of No. 108, unless their coefficients are equal each to each. Hence dyadics are equal only when their equality can be deduced from the principle that the operation of forming a dyad is a distributive one.

On this account, we may regard the dyad as the most general form of product of two vectors.
We shall call it the indeterminate product. The complete determination of a single dyad involves five independent scalars, of a dyadic, nine.

115. It follows from the principles of the last paragraph that if
$$\Sigma\,\alpha\beta = \Sigma\,\kappa\lambda,$$
then
$$\Sigma\,\alpha\times\beta = \Sigma\,\kappa\times\lambda,$$
and
$$\Sigma\,\alpha\cdot\beta = \Sigma\,\kappa\cdot\lambda.$$
In other words, the vector and the scalar obtained from a dyadic by insertion of the sign of skew or direct multiplication in each dyad are both independent of the particular form in which the dyadic is expressed. We shall write $\Phi_\times$ and $\Phi_S$ to indicate the vector and the scalar thus obtained.
$$\Phi_S = i\cdot\Phi\cdot i + j\cdot\Phi\cdot j + k\cdot\Phi\cdot k,$$
$$\Phi_\times = (j\cdot\Phi\cdot k - k\cdot\Phi\cdot j)\,i + (k\cdot\Phi\cdot i - i\cdot\Phi\cdot k)\,j + (i\cdot\Phi\cdot j - j\cdot\Phi\cdot i)\,k,$$
as is at once evident, if we suppose $\Phi$ to be expanded in terms of $ii$, $ij$, etc.

116. Def.—The (direct) product of two dyads (indicated by a dot) is the dyad formed of the first and last of the four factors, multiplied by the direct product of the second and third. That is,
$$\{\alpha\beta\}\cdot\{\gamma\delta\} = \alpha\,\beta\cdot\gamma\,\delta = \beta\cdot\gamma\;\alpha\delta.$$
The (direct) product of two dyadics is the sum of all the products formed by prefixing a term of the first dyadic to a term of the second. Since the direct product of one dyadic with another is a dyadic, it may be multiplied in the same way by a third, and so on indefinitely. This kind of multiplication is evidently associative, as well as distributive. The same is true of the direct product of a series of factors of which the first and the last are either dyadics or vectors, and the other factors are dyadics. Thus the values of the expressions
$$\alpha\cdot\Phi\cdot\Theta\cdot\Psi\cdot\beta, \qquad \alpha\cdot\Phi\cdot\Theta, \qquad \Phi\cdot\Theta\cdot\Psi\cdot\beta, \qquad \Phi\cdot\Theta\cdot\Psi$$
will not be affected by any insertion of parentheses. But this kind of multiplication is not commutative, except in the case of the direct product of two vectors.

117. Def.—The expressions $\Phi\times\rho$ and $\rho\times\Phi$ represent dyadics which we shall call the skew products of $\Phi$ and $\rho$. If
$$\Phi = \alpha\lambda + \beta\mu + \text{etc.},$$
these skew products are defined by the equations
$$\Phi\times\rho = \alpha\,\lambda\times\rho + \beta\,\mu\times\rho + \text{etc.},$$
$$\rho\times\Phi = \rho\times\alpha\,\lambda + \rho\times\beta\,\mu + \text{etc.}$$
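As an aside for the modern reader, and not part of the original text, the rules of Nos. 113 to 116 may be checked numerically. The sketch below rests on an assumption of its own: a dyadic is stored as the 3x3 table of its coefficients on $ii$, $ij$, ..., $kk$ (No. 114), so that the direct product of two dyadics becomes an ordinary matrix product. All function names (`dyad`, `direct`, `scalar_of`, `vector_of`) are hypothetical helpers chosen for this illustration.

```python
# Vectors are 3-tuples. A dyadic is a 3x3 table whose entry [r][c] is the
# coefficient of (unit vector r)(unit vector c), as in No. 114.

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dyad(a, b):
    """Indeterminate product ab (antecedent a, consequent b)."""
    return [[a[r] * b[c] for c in range(3)] for r in range(3)]

def add(P, Q):
    return [[P[r][c] + Q[r][c] for c in range(3)] for r in range(3)]

def scale(s, P):
    return [[s * P[r][c] for c in range(3)] for r in range(3)]

def direct(P, Q):
    """Direct product P.Q of two dyadics (No. 116)."""
    return [[sum(P[r][m] * Q[m][c] for m in range(3)) for c in range(3)]
            for r in range(3)]

def scalar_of(P):
    """Phi_S: a dot inserted in every dyad (No. 115)."""
    return P[0][0] + P[1][1] + P[2][2]

def vector_of(P):
    """Phi_x: a cross inserted in every dyad (No. 115)."""
    return (P[1][2] - P[2][1], P[2][0] - P[0][2], P[0][1] - P[1][0])

a, b, c, d = (1, 2, 3), (4, 5, 6), (7, 8, 10), (2, 0, 1)

# No. 116: {ab}.{cd} = (b.c) ad
lhs = direct(dyad(a, b), dyad(c, d))
rhs = scale(dot(b, c), dyad(a, d))
print(lhs == rhs)

# No. 113: a[b + c] and ab + ac are two forms of the same dyadic,
# and No. 115: both forms yield the same scalar and the same vector.
form1 = dyad(a, tuple(x + y for x, y in zip(b, c)))
form2 = add(dyad(a, b), dyad(a, c))
print(form1 == form2, scalar_of(form1) == dot(a, b) + dot(a, c))
print(vector_of(dyad(a, b)) == cross(a, b))
```

Integer components are used so that every comparison is exact. The choice of a coefficient table is only one possible representation; the text itself never commits to coordinates.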
It is evident that
$$\{\rho\times\Phi\}\cdot\Psi = \rho\times\{\Phi\cdot\Psi\}, \qquad \Psi\cdot\{\Phi\times\rho\} = \{\Psi\cdot\Phi\}\times\rho,$$
$$\{\rho\times\Phi\}\cdot\alpha = \rho\times[\Phi\cdot\alpha], \qquad \alpha\cdot\{\Phi\times\rho\} = [\alpha\cdot\Phi]\times\rho,$$
$$\{\rho\times\Phi\}\times\alpha = \rho\times\{\Phi\times\alpha\}.$$
We may therefore write without ambiguity
$$\rho\times\Phi\cdot\Psi, \qquad \Psi\cdot\Phi\times\rho, \qquad \rho\times\Phi\cdot\alpha, \qquad \alpha\cdot\Phi\times\rho, \qquad \rho\times\Phi\times\alpha.$$
This may be expressed a little more generally by saying that the associative principle enunciated in No. 116 may be extended to cases in which the initial or final vectors are connected with the other factors by the sign of skew multiplication.

Moreover,
$$\alpha\cdot\rho\times\Phi = [\alpha\times\rho]\cdot\Phi \quad\text{and}\quad \Phi\times\rho\cdot\alpha = \Phi\cdot[\rho\times\alpha].$$
These expressions evidently represent vectors. So
$$\Psi\cdot\{\rho\times\Phi\} = \{\Psi\times\rho\}\cdot\Phi.$$
These expressions represent dyadics. The braces cannot be omitted without ambiguity.

118. Since all the antecedents or all the consequents in any dyadic may be expressed in parts of any three non-complanar vectors, and since the sum of any number of dyads having the same antecedent or the same consequent may be expressed by a single dyad, it follows that any dyadic may be expressed as the sum of three dyads, and so, that either the antecedents or the consequents shall be any desired non-complanar vectors, but only in one way when either the antecedents or the consequents are thus given.

In particular, the dyadic
$$a\,ii + b\,ij + c\,ik + a'\,ji + b'\,jj + c'\,jk + a''\,ki + b''\,kj + c''\,kk,$$
which may for brevity be written
$$\begin{Bmatrix} a & b & c \\ a' & b' & c' \\ a'' & b'' & c'' \end{Bmatrix},$$
is equal to
$$\alpha i + \beta j + \gamma k,$$
where
$$\alpha = a\,i + a'\,j + a''\,k, \qquad \beta = b\,i + b'\,j + b''\,k, \qquad \gamma = c\,i + c'\,j + c''\,k,$$
and to
$$i\lambda + j\mu + k\nu,$$
where
$$\lambda = a\,i + b\,j + c\,k, \qquad \mu = a'\,i + b'\,j + c'\,k, \qquad \nu = a''\,i + b''\,j + c''\,k.$$

119. By a similar process, the sum of three dyads may be reduced to the sum of two dyads, whenever either the antecedents or the consequents are complanar, and only in such cases. To prove the latter point, let us suppose that in the dyadic
$$\alpha\lambda + \beta\mu + \gamma\nu$$
neither the antecedents nor the consequents are complanar.
The vector
$$\{\alpha\lambda + \beta\mu + \gamma\nu\}\cdot\rho$$
is a linear function of $\rho$ which will be parallel to $\alpha$ when $\rho$ is perpendicular to $\mu$ and $\nu$, which will be parallel to $\beta$ when $\rho$ is perpendicular to $\nu$ and $\lambda$, and which will be parallel to $\gamma$ when $\rho$ is perpendicular to $\lambda$ and $\mu$. Hence, the function may be given any value whatever by giving the proper value to $\rho$. This would evidently not be the case with the sum of two dyads. Hence, by No. 108, this dyadic cannot be equal to the sum of two dyads.

120. In like manner, the sum of two dyads may be reduced to a single dyad, if either the antecedents or the consequents are parallel, and only in such cases.

A sum of three dyads cannot be reduced to a single dyad, unless either their antecedents or consequents are parallel, or both antecedents and consequents are (separately) complanar. In the first case the reduction can always be made, in the second, occasionally.

121. Def.—A dyadic which cannot be reduced to the sum of less than three dyads will be called complete.

A dyadic which can be reduced to the sum of two dyads will be called planar. When the plane of the antecedents coincides with that of the consequents, the dyadic will be called uniplanar. These planes are invariable for a given dyadic, although the dyadic may be so expressed that either the two antecedents or the two consequents may have any desired values (which are not parallel) within their planes.

A dyadic which can be reduced to a single dyad will be called linear. When the antecedent and consequent are parallel, it will be called unilinear.

A dyadic is said to have the value zero when all its terms vanish.

122. If we set [...] $\sigma = $ [...], and consider the limits within which $\sigma$ varies, when we give $\rho$ all possible values.

The products $\Psi\times\rho$ and $\rho\times\Phi$ are evidently planar dyadics.

124. Def.—A dyadic $\Phi$ is said to be an idemfactor, when
$$\Phi\cdot\rho = \rho \quad\text{for all values of } \rho,$$
or when
$$\rho\cdot\Phi = \rho \quad\text{for all values of } \rho.$$
If either of these conditions holds true, $\Phi$
must be reducible to the form
$$ii + jj + kk.$$
Therefore, both conditions will hold, if either does. All such dyadics are equal, by No. 108. They will be represented by the letter $I$.

The direct product of an idemfactor with another dyadic is equal to that dyadic. That is,
$$I\cdot\Phi = \Phi, \qquad \Phi\cdot I = \Phi,$$
where $\Phi$ is any dyadic.

A dyadic of the form
$$\alpha\alpha' + \beta\beta' + \gamma\gamma',$$
in which $\alpha'$, $\beta'$, $\gamma'$ are the reciprocals of $\alpha$, $\beta$, $\gamma$, is an idemfactor. (See No. 38.) A dyadic trinomial cannot be an idemfactor, unless its antecedents and consequents are reciprocals.

125. If one of the direct products of two dyadics is an idemfactor, the other is also. For, if $\Phi\cdot\Psi = I$,
$$\sigma\cdot\Phi\cdot\Psi = \sigma$$
for all values of $\sigma$, and $\Phi$ is complete;
$$\sigma\cdot\Phi\cdot\Psi\cdot\Phi = \sigma\cdot\Phi$$
for all values of $\sigma$, therefore for all values of $\sigma\cdot\Phi$, and therefore $\Psi\cdot\Phi = I$.

Def.—In this case, either dyadic is called the reciprocal of the other.

It is evident that an incomplete dyadic cannot have any (finite) reciprocal.

Reciprocals of the same dyadic are equal. For if $\Phi$ and $\Psi$ are both reciprocals of $\Omega$,
$$\Phi = \Phi\cdot\Omega\cdot\Psi = \Psi.$$
If two dyadics are reciprocals, the operators formed by using these dyadics as prefactors are inverse, also the operators formed by using them as postfactors.

126. The reciprocal of any complete dyadic
$$\alpha\lambda + \beta\mu + \gamma\nu$$
is
$$\lambda'\alpha' + \mu'\beta' + \nu'\gamma',$$
where $\alpha'$, $\beta'$, $\gamma'$ are the reciprocals of $\alpha$, $\beta$, $\gamma$, and $\lambda'$, $\mu'$, $\nu'$ are the reciprocals of $\lambda$, $\mu$, $\nu$. (See No. 38.)

127. Def.—We shall write $\Phi^{-1}$ for the reciprocal of any (complete) dyadic $\Phi$, also $\Phi^2$ for $\Phi\cdot\Phi$, etc., and $\Phi^{-2}$ for $\Phi^{-1}\cdot\Phi^{-1}$, etc. It is evident that $\Phi^{-n}$ is the reciprocal of $\Phi^{n}$.

128. In the reduction of equations, if we have
$$\Phi\cdot\Psi = \Phi\cdot\Omega,$$
we may cancel the $\Phi$ (which is equivalent to multiplying by $\Phi^{-1}$) if $\Phi$ is a complete dyadic, but not otherwise. The case is the same with such equations as
$$\Phi\cdot\sigma = \Phi\cdot\rho, \qquad \Psi\cdot\Phi = \Omega\cdot\Phi, \qquad \rho\cdot\Phi = \sigma\cdot\Phi.$$
To cancel an incomplete dyadic in such cases would be analogous to cancelling a zero factor in algebra.

129.
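Def.—If in any dyadic we transpose the factors in each term, the dyadic thus formed is said to be conjugate to the first. Thus
$$\alpha\lambda + \beta\mu + \gamma\nu \quad\text{and}\quad \lambda\alpha + \mu\beta + \nu\gamma$$
are conjugate to each other. A dyadic of which the value is not altered by such transposition is said to be self-conjugate. The conjugate of any dyadic $\Phi$ may be written $\Phi_C$. It is evident that
$$\rho\cdot\Phi = \Phi_C\cdot\rho \quad\text{and}\quad \Phi\cdot\rho = \rho\cdot\Phi_C.$$
$\Phi_C\cdot\rho$ and $\Phi\cdot\rho$ are conjugate functions of $\rho$. (See No. 106.) Since $\{\Phi_C\}^2 = \{\Phi^2\}_C$, we may write $\Phi_C^{\,2}$, etc., without ambiguity.

130. The reciprocal of the product of any number of dyadics is equal to the product of their reciprocals taken in inverse order. Thus
$$\{\Phi\cdot\Psi\cdot\Omega\}^{-1} = \Omega^{-1}\cdot\Psi^{-1}\cdot\Phi^{-1}.$$
The conjugate of the product of any number of dyadics is equal to the product of their conjugates taken in inverse order. Thus
$$\{\Phi\cdot\Psi\cdot\Omega\}_C = \Omega_C\cdot\Psi_C\cdot\Phi_C.$$
Hence, since
$$\Phi_C\cdot\{\Phi^{-1}\}_C = \{\Phi^{-1}\cdot\Phi\}_C = I,$$
$$\{\Phi^{-1}\}_C = \{\Phi_C\}^{-1},$$
and we may write $\Phi_C^{-1}$ without ambiguity.

131. It is sometimes convenient to be able to express by a dyadic taken in direct multiplication the same operation which would be effected by a given vector ($\alpha$) in skew multiplication. The dyadic $I\times\alpha$ will answer this purpose. For, by No. 117,
$$\{I\times\alpha\}\cdot\rho = \alpha\times\rho, \qquad \rho\cdot\{I\times\alpha\} = \rho\times\alpha,$$
$$\{I\times\alpha\}\cdot\Phi = \alpha\times\Phi, \qquad \Phi\cdot\{I\times\alpha\} = \Phi\times\alpha.$$
The same is true of the dyadic $\alpha\times I$, which is indeed identical with $I\times\alpha$, as appears from the equation $I\cdot\{\alpha\times I\} = \{I\times\alpha\}\cdot I$.

If $\alpha$ is a unit vector,
$$\{I\times\alpha\}^2 = -\{I - \alpha\alpha\}, \qquad \{I\times\alpha\}^3 = -I\times\alpha,$$
$$\{I\times\alpha\}^4 = I - \alpha\alpha, \qquad \{I\times\alpha\}^5 = I\times\alpha, \qquad \text{etc.}$$
If $i$, $j$, $k$ are a normal system of unit vectors,
$$I\times i = i\times I = kj - jk,$$
$$I\times j = j\times I = ik - ki,$$
$$I\times k = k\times I = ji - ij.$$
If $\alpha$ and $\beta$ are any vectors,
$$[\alpha\times\beta]\times I = I\times[\alpha\times\beta] = \beta\alpha - \alpha\beta.$$
That is, the vector $\alpha\times\beta$ as a pre- or post-factor in skew multiplication is equivalent to the dyadic $\{\beta\alpha - \alpha\beta\}$ taken as pre- or post-factor in direct multiplication.
$$[\alpha\times\beta]\times\rho = \{\beta\alpha - \alpha\beta\}\cdot\rho, \qquad \rho\times[\alpha\times\beta] = \rho\cdot\{\beta\alpha - \alpha\beta\}.$$
This is essentially the theorem of No. 27, expressed in a form more symmetrical, and more easily remembered.

132.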
The equation
$$\alpha\,\beta\times\gamma + \beta\,\gamma\times\alpha + \gamma\,\alpha\times\beta = \alpha\cdot\beta\times\gamma\;I$$
gives, on multiplication by any vector $\rho$, the identical equation
$$\rho\cdot\alpha\;\beta\times\gamma + \rho\cdot\beta\;\gamma\times\alpha + \rho\cdot\gamma\;\alpha\times\beta = \alpha\cdot\beta\times\gamma\;\rho.$$
(See No. 37.) The former equation is therefore identically true. (See No. 108.) It is a little more general than the equation
$$\alpha\alpha' + \beta\beta' + \gamma\gamma' = I,$$
which we have already considered (No. 124), since, in the form here given, it is not necessary that $\alpha$, $\beta$, and $\gamma$ should be non-complanar. We may also write
$$\beta\times\gamma\;\alpha + \gamma\times\alpha\;\beta + \alpha\times\beta\;\gamma = \alpha\cdot\beta\times\gamma\;I.$$
Multiplying this equation by $\rho$ as prefactor (or the first equation by $\rho$ as postfactor), we obtain
$$\rho\cdot\beta\times\gamma\;\alpha + \rho\cdot\gamma\times\alpha\;\beta + \rho\cdot\alpha\times\beta\;\gamma = \alpha\cdot\beta\times\gamma\;\rho.$$
(Compare No. 37.) For three complanar vectors we have
$$\alpha\,\beta\times\gamma + \beta\,\gamma\times\alpha + \gamma\,\alpha\times\beta = 0.$$
Multiplying this by $\nu$, a unit normal to the plane of $\alpha$, $\beta$, and $\gamma$, we have
$$\alpha\;\beta\times\gamma\cdot\nu + \beta\;\gamma\times\alpha\cdot\nu + \gamma\;\alpha\times\beta\cdot\nu = 0.$$
This equation expresses the well-known theorem that if the geometrical sum of three vectors is zero, the magnitude of each vector is proportional to the sine of the angle between the other two. It also indicates the numerical coefficients by which one of three complanar vectors may be expressed in parts of the other two.

133. Def.—If two dyadics $\Phi$ and $\Psi$ are such that
$$\Phi\cdot\Psi = \Psi\cdot\Phi,$$
they are said to be homologous.

If any number of dyadics are homologous to one another, and any other dyadics are formed from them by the operations of taking multiples, sums, differences, powers, reciprocals, or products, such dyadics will be homologous to each other and to the original dyadics. This requires demonstration only in regard to reciprocals. Now if
$$\Phi\cdot\Psi = \Psi\cdot\Phi,$$
$$\Psi\cdot\Phi^{-1} = \Phi^{-1}\cdot\Phi\cdot\Psi\cdot\Phi^{-1} = \Phi^{-1}\cdot\Psi\cdot\Phi\cdot\Phi^{-1} = \Phi^{-1}\cdot\Psi.$$
That is, $\Phi^{-1}$ is homologous to $\Psi$, if $\Phi$ is.

134. If we call $\Psi\cdot\Phi^{-1}$, or $\Phi^{-1}\cdot\Psi$, the quotient of $\Psi$ and $\Phi$, we may say that the rules of addition, subtraction, multiplication and division of homologous dyadics are identical with those of arithmetic or ordinary algebra, except that limitations analogous to those respecting zero in algebra must be observed with respect to all incomplete dyadics.
It follows that the algebraic and higher analysis of homologous dyadics is substantially identical with that of scalars.

135. It is always possible to express a dyadic in three terms, so that both the antecedents and the consequents shall be perpendicular among themselves.

To show this for any dyadic $\Phi$, let us set
$$\rho' = \Phi\cdot\rho,$$
$\rho$ being a unit-vector, and consider the different values of $\rho'$ for all possible directions of $\rho$. Let the direction of the unit vector $i$ be so determined that when $\rho$ coincides with $i$, the value of $\rho'$ shall be at least as great as for any other direction of $\rho$. And let the direction of the unit vector $j$ be so determined that when $\rho$ coincides with $j$, the value of $\rho'$ shall be at least as great as for any other direction of $\rho$ which is perpendicular to $i$. Let $k$ have its usual position with respect to $i$ and $j$. It is evidently possible to express $\Phi$ in the form
$$\alpha i + \beta j + \gamma k.$$
We have therefore
$$\rho' = \{\alpha i + \beta j + \gamma k\}\cdot\rho,$$
and
$$d\rho' = \{\alpha i + \beta j + \gamma k\}\cdot d\rho.$$
Now the supposed property of the direction of $i$ requires that when $\rho$ coincides with $i$ and $d\rho$ is perpendicular to $i$, $d\rho'$ shall be perpendicular to $\rho'$, which will then be parallel to $\alpha$. But if $d\rho$ is parallel to $j$ or $k$, it will be perpendicular to $i$, and $d\rho'$ will be parallel to $\beta$ or $\gamma$, as the case may be. Therefore $\beta$ and $\gamma$ are perpendicular to $\alpha$. In the same way it may be shown that the condition relative to $j$ requires that $\gamma$ shall be perpendicular to $\beta$. We may therefore set
$$\Phi = a\,i'i + b\,j'j + c\,k'k,$$
where $i'$, $j'$, $k'$, like $i$, $j$, $k$, constitute a normal system of unit vectors (see No. 11), and $a$, $b$, $c$ are scalars which may be either positive or negative.

It makes an important difference whether the number of these scalars which are negative is even or odd. If two are negative, say $a$ and $b$, we may make them positive by reversing the directions of $i'$ and $j'$. The vectors $i'$, $j'$, $k'$ will still constitute a normal system.
But if we should reverse the directions of an odd number of these vectors, they would cease to constitute a normal system, and to be superposable upon the system $i$, $j$, $k$. We may, however, always set either
$$\Phi = a\,i'i + b\,j'j + c\,k'k,$$
or
$$\Phi = -\{a\,i'i + b\,j'j + c\,k'k\},$$
with positive values of $a$, $b$, and $c$. At the limit between these cases are the planar dyadics, in which one of the three terms vanishes, and the dyadic reduces to the form
$$a\,i'i + b\,j'j,$$
in which $a$ and $b$ may always be made positive by giving the proper directions to $i'$ and $j'$.

If the numerical values of $a$, $b$, $c$ are all unequal, there will be only one way in which the value of $\Phi$ may be thus expressed. If they are not all unequal, there will be an infinite number of ways in which $\Phi$ may be thus expressed, in all of which the three scalar coefficients will have the same values with exception of the changes of signs mentioned above. If the three values are numerically identical, we may give to either system of normal vectors an arbitrary position.

136. It follows that any self-conjugate dyadic may be expressed in the form
$$a\,ii + b\,jj + c\,kk,$$
where $i$, $j$, $k$ are a normal system of unit vectors, and $a$, $b$, $c$ are positive or negative scalars.

137. Any dyadic may be divided into two parts, of which one shall be self-conjugate, and the other of the form $I\times\alpha$. These parts are found by taking half the sum and half the difference of the dyadic and its conjugate. It is evident that
$$\Phi = \tfrac{1}{2}\{\Phi + \Phi_C\} + \tfrac{1}{2}\{\Phi - \Phi_C\}.$$
Now $\tfrac{1}{2}\{\Phi + \Phi_C\}$ is self-conjugate, and
$$\tfrac{1}{2}\{\Phi - \Phi_C\} = I\times[-\tfrac{1}{2}\Phi_\times].$$
(See No. 131.)

Rotations and Strains.

138. To illustrate the use of dyadics as operators, let us suppose that a body receives such a displacement that
$$\rho' = \Phi\cdot\rho,$$
$\rho$ and $\rho'$ being the position-vectors of the same point of the body in its initial and subsequent positions. The same relation will hold of the vectors which unite any two points of the body in their initial and subsequent positions.
For if $\rho_1$, $\rho_2$ are the original position-vectors of the points, and $\rho_1'$, $\rho_2'$ their final position-vectors, we have
$$\rho_1' = \Phi\cdot\rho_1, \qquad \rho_2' = \Phi\cdot\rho_2,$$
whence
$$\rho_2' - \rho_1' = \Phi\cdot[\rho_2 - \rho_1].$$
In the most general case, the body is said to receive a homogeneous strain. In special cases, the displacement reduces to a rotation. Lines in the body initially straight and parallel will be straight and parallel after the displacement, and surfaces initially plane and parallel will be plane and parallel after the displacement.

139. The vectors [...] Let the strain be expressed by
$$\rho' = \{\alpha\lambda + \beta\mu + \gamma\nu\}\cdot\rho,$$
and let us write $\lambda'$, $\mu'$, $\nu'$ for the reciprocals of $\lambda$, $\mu$, $\nu$; the vectors $\lambda'$, $\mu'$, $\nu'$ become by the strain $\alpha$, $\beta$, $\gamma$. Therefore the surfaces $\mu'\times\nu'$, $\nu'\times\lambda'$, $\lambda'\times\mu'$ become $\beta\times\gamma$, $\gamma\times\alpha$, $\alpha\times\beta$. But $\mu'\times\nu'$, $\nu'\times\lambda'$, $\lambda'\times\mu'$ are the reciprocals of $\mu\times\nu$, $\nu\times\lambda$, $\lambda\times\mu$. The relation sought is therefore
$$\sigma' = \{\beta\times\gamma\;\mu\times\nu + \gamma\times\alpha\;\nu\times\lambda + \alpha\times\beta\;\lambda\times\mu\}\cdot\sigma.$$

140. The volume $\lambda'\cdot\mu'\times\nu'$ becomes by the strain $\alpha\cdot\beta\times\gamma$. The unit of volume becomes therefore $(\alpha\cdot\beta\times\gamma)(\lambda\cdot\mu\times\nu)$.

Def.—It follows that the scalar product of the three antecedents multiplied by the scalar product of the three consequents of a dyadic expressed as a trinomial is independent of the particular form in which the dyadic is thus expressed. This quantity is the determinant of the coefficients of the nine terms of the form
$$a\,ii + b\,ij + \text{etc.},$$
into which the dyadic may be expanded. We shall call it the determinant of the dyadic, and shall denote it by the notation
$$|\Phi|$$
when the dyadic is expressed by a single letter.

If a dyadic is incomplete, its determinant is zero, and conversely.

The determinant of the product of any number of dyadics is equal to the product of their determinants. The determinant of the reciprocal of a dyadic is the reciprocal of the determinant of that dyadic. The determinants of a dyadic and its conjugate are equal.

The relation of the surfaces $\sigma$ and $\sigma'$ may be expressed by the equation
$$\sigma' = |\Phi|\;\Phi_C^{-1}\cdot\sigma.$$

141. If $\Phi$ is reducible to the form
$$i'i + j'j + k'k,$$
$i$, $j$, $k$, $i'$, $j'$, $k'$ being normal systems of unit vectors (see No. 11), the body will suffer no change of form. For if
$$\rho = xi + yj + zk,$$
we shall have
$$\rho' = xi' + yj' + zk'.$$
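The statements of Nos. 139 and 140 lend themselves to a quick numerical check, offered here as an illustration that is not part of the original text. The sketch assumes a dyadic is stored as its 3x3 coefficient table; the helper names (`total`, `apply`, `det3`) and the sample vectors are arbitrary choices made for the example.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dyad(a, b):
    """Dyad ab as a table of coefficients on ii, ij, ..., kk."""
    return [[a[r] * b[c] for c in range(3)] for r in range(3)]

def total(*dyads):
    """Sum of several dyads, i.e. a dyadic."""
    return [[sum(D[r][c] for D in dyads) for c in range(3)] for r in range(3)]

def apply(P, v):
    """The dyadic P used as a prefactor: P.v"""
    return tuple(dot(row, v) for row in P)

def det3(P):
    """Determinant of the nine coefficients."""
    return dot(P[0], cross(P[1], P[2]))

alpha, beta, gamma = (1, 0, 2), (0, 1, 1), (1, 1, 0)
lam, mu, nu = (1, 2, 0), (0, 1, 3), (2, 0, 1)

# The strain of No. 139: Phi = alpha lam + beta mu + gamma nu
Phi = total(dyad(alpha, lam), dyad(beta, mu), dyad(gamma, nu))

# No. 140: product of the two triple products against the coefficient determinant
det_from_vectors = dot(alpha, cross(beta, gamma)) * dot(lam, cross(mu, nu))
det_from_coefficients = det3(Phi)

# No. 139: the dyadic that carries a surface vector sigma to sigma'
Surf = total(dyad(cross(beta, gamma), cross(mu, nu)),
             dyad(cross(gamma, alpha), cross(nu, lam)),
             dyad(cross(alpha, beta), cross(lam, mu)))

# A parallelogram on two edges r1, r2 has surface vector r1 x r2;
# after the strain its edges are Phi.r1 and Phi.r2.
r1, r2 = (1, 0, 0), (2, -1, 3)
sigma = cross(r1, r2)
sigma_strained = cross(apply(Phi, r1), apply(Phi, r2))

print(det_from_vectors == det_from_coefficients)
print(apply(Surf, sigma) == sigma_strained)
```

Integer components keep the comparisons exact. Representing the surface by the skew product of two edges is a convenience of the example; any plane surface in the body would serve.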
Conversely, if the body suffers no change of form, the operating dyadic is reducible to the above form. In such cases, it appears from simple geometrical considerations that the displacement of the body may be produced by a rotation about a certain axis. A dyadic reducible to the form
$$i'i + j'j + k'k$$
may therefore be called a versor.

142. The conjugate operator evidently produces the reverse rotation. A versor, therefore, is the reciprocal of its conjugate. Conversely, if a dyadic is the reciprocal of its conjugate, it is either a versor, or a versor multiplied by $-1$. For the dyadic may be expressed in the form
$$\alpha i + \beta j + \gamma k.$$
Its conjugate will be
$$i\alpha + j\beta + k\gamma.$$
If these are reciprocals, we have
$$\{\alpha i + \beta j + \gamma k\}\cdot\{i\alpha + j\beta + k\gamma\} = \alpha\alpha + \beta\beta + \gamma\gamma = I.$$
But this relation cannot subsist unless $\alpha$, $\beta$, $\gamma$ are reciprocals to themselves, i.e., unless they are mutually perpendicular unit-vectors. Therefore, they either are a normal system of unit-vectors, or will become such if their directions are reversed. Therefore, one of the dyadics
$$\alpha i + \beta j + \gamma k \quad\text{and}\quad -\alpha i - \beta j - \gamma k$$
is a versor.

* [See note on p. 90.]

The criterion of a versor may therefore be written
$$\Phi\cdot\Phi_C = I, \quad\text{and}\quad |\Phi| = 1.$$
For the last equation we may substitute
$$|\Phi| > 0, \quad\text{or}\quad |\Phi| \neq -1.$$
It is evident that the resultant of successive finite rotations is obtained by multiplication of the versors.

143. If we take the axis of the rotation for the direction of $i$, $i'$ will have the same direction, and the versor reduces to the form
$$ii + j'j + k'k,$$
in which $i$, $j$, $k$ and $i$, $j'$, $k'$ are normal systems of unit vectors. We may set
$$j' = \cos q\;j + \sin q\;k, \qquad k' = \cos q\;k - \sin q\;j,$$
and the versor reduces to
$$ii + \cos q\,\{jj + kk\} + \sin q\,\{kj - jk\},$$
or
$$ii + \cos q\,\{I - ii\} + \sin q\;I\times i,$$
where $q$ is the angle of rotation, measured from $j$ toward $k$, if the versor is used as a prefactor.

144.
When any versor $\Phi$ is used as a prefactor, the vector $-\Phi_\times$ will be parallel to the axis of rotation, and equal in magnitude to twice the sine of the angle of rotation measured counter-clockwise as seen from the direction in which the vector points. (This will appear if we suppose $\Phi$ to be represented in the form given in the last paragraph.) The scalar $\Phi_S$ will be equal to unity increased by twice the cosine of the same angle. Together, $-\Phi_\times$ and $\Phi_S$ determine the versor without ambiguity. If we set
$$\theta = \frac{-\Phi_\times}{1 + \Phi_S},$$
the magnitude of $\theta$ will be
$$\frac{2\sin q}{2 + 2\cos q}, \quad\text{or}\quad \tan\tfrac{1}{2}q,$$
where $q$ is measured counter-clockwise as seen from the direction in which $\theta$ points. This vector $\theta$, which we may call the vector semi-tangent of version, determines the versor without ambiguity.

145. The versor $\Phi$ may be expressed in terms of $\theta$ in various ways. Since $\Phi$ (as prefactor) changes $\alpha - \theta\times\alpha$ into $\alpha + \theta\times\alpha$ ($\alpha$ being any vector), we have
$$\Phi = \{I + I\times\theta\}\cdot\{I - I\times\theta\}^{-1}.$$
Again,
$$\Phi = \frac{\theta\theta + \{I + I\times\theta\}^2}{1 + \theta\cdot\theta} = \frac{(1 - \theta\cdot\theta)\,I + 2\theta\theta + 2\,I\times\theta}{1 + \theta\cdot\theta},$$
as will be evident on considering separately in the expression $\Phi\cdot\rho$ the components perpendicular and parallel to $\theta$, or on substituting in
$$ii + \cos q\,(jj + kk) + \sin q\,(kj - jk)$$
for $\cos q$ and $\sin q$ their values in terms of $\tan\tfrac{1}{2}q$.

If we set, in either of these equations,
$$\theta = ai + bj + ck,$$
we obtain, on reduction, the formula
$$\Phi = \frac{1}{1 + a^2 + b^2 + c^2}\left\{\begin{aligned} &(1 + a^2 - b^2 - c^2)\,ii + (2ab - 2c)\,ij + (2ac + 2b)\,ik \\ +\;&(2ab + 2c)\,ji + (1 - a^2 + b^2 - c^2)\,jj + (2bc - 2a)\,jk \\ +\;&(2ac - 2b)\,ki + (2bc + 2a)\,kj + (1 - a^2 - b^2 + c^2)\,kk \end{aligned}\right\},$$
in which the versor is expressed in terms of the rectangular components of the vector semitangent of version.

146. If $\alpha$, $\beta$, $\gamma$ are unit vectors, expressions of the form
$$2\alpha\alpha - I, \qquad 2\beta\beta - I, \qquad 2\gamma\gamma - I,$$
are biquadrantal versors. A product like
$$\{2\beta\beta - I\}\cdot\{2\alpha\alpha - I\}$$
is a versor of which the axis is perpendicular to $\alpha$ and $\beta$, and the amount of rotation twice that which would carry $\alpha$ to $\beta$.
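The formulae of Nos. 143 to 145 can be tried numerically. The following sketch is an illustration added for the modern reader and is not Gibbs's own; it assumes a versor is stored as the 3x3 table of its coefficients on $ii$, $ij$, ..., $kk$, and the function names are hypothetical. It builds the versor from the components $a$, $b$, $c$ of the vector semitangent by the formula of No. 145, compares its action with the axis-and-angle form of No. 143, and recovers $\theta$ by the rule of No. 144.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def versor_from_semitangent(theta):
    """Coefficient table of the versor, from the component formula of No. 145."""
    a, b, c = theta
    n = 1 + a * a + b * b + c * c
    return [[(1 + a*a - b*b - c*c) / n, (2*a*b - 2*c) / n, (2*a*c + 2*b) / n],
            [(2*a*b + 2*c) / n, (1 - a*a + b*b - c*c) / n, (2*b*c - 2*a) / n],
            [(2*a*c - 2*b) / n, (2*b*c + 2*a) / n, (1 - a*a - b*b + c*c) / n]]

def semitangent_of(P):
    """theta = -Phi_x / (1 + Phi_S), No. 144. Fails for a half turn, where 1 + Phi_S = 0."""
    s = 1 + P[0][0] + P[1][1] + P[2][2]
    vx = (P[1][2] - P[2][1], P[2][0] - P[0][2], P[0][1] - P[1][0])
    return tuple(-x / s for x in vx)

def apply(P, v):
    """The dyadic P used as a prefactor: P.v"""
    return tuple(dot(row, v) for row in P)

def rotate_axis_angle(axis, q, v):
    """No. 143 with a unit axis n: n n.v + cos q (v - n n.v) + sin q n x v."""
    par = tuple(dot(axis, v) * x for x in axis)
    perp = tuple(x - y for x, y in zip(v, par))
    turned = cross(axis, v)
    return tuple(p + math.cos(q) * w + math.sin(q) * t
                 for p, w, t in zip(par, perp, turned))

# A quarter turn about k has semitangent tan(45 deg) k = k.
quarter = versor_from_semitangent((0.0, 0.0, 1.0))

# A general case: unit axis (1, 2, 2)/3 and angle q = 0.8 radians.
axis, q = (1 / 3, 2 / 3, 2 / 3), 0.8
theta = tuple(math.tan(q / 2) * x for x in axis)
v = (0.3, -1.2, 2.0)
by_versor = apply(versor_from_semitangent(theta), v)
by_axis_angle = rotate_axis_angle(axis, q, v)
largest_difference = max(abs(x - y) for x, y in zip(by_versor, by_axis_angle))

print(quarter)
print(largest_difference < 1e-12)
```

The semitangent has no finite value for a rotation of two right angles, which is why `semitangent_of` is noted as failing there; that limitation belongs to the parametrization itself, not to the code.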
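It is evident that any versor may be thus expressed, and that either $\alpha$ or $\beta$ may be given any direction perpendicular to the axis of rotation. If
$$\Phi = \{2\beta\beta - I\}\cdot\{2\alpha\alpha - I\}, \quad\text{and}\quad \Psi = \{2\gamma\gamma - I\}\cdot\{2\beta\beta - I\},$$
we have for the resultant of the successive rotations
$$\Psi\cdot\Phi = \{2\gamma\gamma - I\}\cdot\{2\alpha\alpha - I\}.$$
This may be applied to the composition of any two successive rotations, $\beta$ being taken perpendicular to the two axes of rotation, and affords the means of determining the resultant rotation by construction on the surface of a sphere. It also furnishes a simple method of finding the relations of the vector semitangents of version for the versors $\Phi$, $\Psi$, and $\Psi\cdot\Phi$. Let
$$\theta_1 = \frac{-\Phi_\times}{1 + \Phi_S}, \qquad \theta_2 = \frac{-\Psi_\times}{1 + \Psi_S}, \qquad \theta_3 = \frac{-\{\Psi\cdot\Phi\}_\times}{1 + \{\Psi\cdot\Phi\}_S}.$$
Then, since
$$\Phi = 4\,\alpha\cdot\beta\;\beta\alpha - 2\alpha\alpha - 2\beta\beta + I,$$
$$\theta_1 = \frac{\alpha\times\beta}{\alpha\cdot\beta},$$
which is moreover geometrically evident. In like manner,
$$\theta_2 = \frac{\beta\times\gamma}{\beta\cdot\gamma}, \qquad \theta_3 = \frac{\alpha\times\gamma}{\alpha\cdot\gamma}.$$
Therefore,
$$\theta_1\times\theta_2 = \frac{[\alpha\times\beta]\times[\beta\times\gamma]}{\alpha\cdot\beta\;\beta\cdot\gamma} = \frac{\alpha\times\beta\cdot\gamma\;\beta}{\alpha\cdot\beta\;\beta\cdot\gamma} = \frac{\beta\cdot\alpha\;\beta\times\gamma + \beta\cdot\beta\;\gamma\times\alpha + \beta\cdot\gamma\;\alpha\times\beta}{\alpha\cdot\beta\;\beta\cdot\gamma}.$$
(See No. 38.) That is,
$$\theta_1\times\theta_2 = \theta_2 - \frac{\alpha\cdot\gamma}{\alpha\cdot\beta\;\beta\cdot\gamma}\,\theta_3 + \theta_1.$$
Also,
$$\theta_1\cdot\theta_2 = \frac{\alpha\times\beta\cdot\beta\times\gamma}{\alpha\cdot\beta\;\beta\cdot\gamma} = 1 - \frac{\alpha\cdot\gamma}{\alpha\cdot\beta\;\beta\cdot\gamma}.$$
Hence,
$$\theta_1\times\theta_2 = \theta_2 - (1 - \theta_1\cdot\theta_2)\,\theta_3 + \theta_1,$$
$$\theta_3 = \frac{\theta_1 + \theta_2 + \theta_2\times\theta_1}{1 - \theta_1\cdot\theta_2},$$
which is the formula for the composition of successive finite rotations by means of their vector semitangents of version.

147. The versors just described constitute a particular class under the more general form
$$\alpha\alpha' + \cos q\,\{\beta\beta' + \gamma\gamma'\} + \sin q\,\{\gamma\beta' - \beta\gamma'\},$$
in which $\alpha$, $\beta$, $\gamma$ are any non-complanar vectors, and $\alpha'$, $\beta'$, $\gamma'$ their reciprocals. A dyadic of this form as a prefactor does not affect any vector parallel to $\alpha$. Its effect on a vector in the $\beta$-$\gamma$ plane will be best understood if we imagine an ellipse to be described of which $\beta$ and $\gamma$ are conjugate semi-diameters. If the vector to be operated on be a radius of this ellipse, we may evidently regard the ellipse with $\beta$, $\gamma$, and the other vector, as the projections of a circle with two perpendicular radii and one other radius.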
A little consideration will show that if the third radius of the circle is advanced an angle $q$, its projection in the ellipse will be advanced as required by the dyadic prefactor. The effect, therefore, of such a prefactor on a vector in the $\beta$-$\gamma$ plane may be obtained as follows: Describe an ellipse of which $\beta$ and $\gamma$ are conjugate semi-diameters. Then describe a similar and similarly placed ellipse of which the vector to be operated on is a radius. The effect of the operator is to advance the radius in this ellipse, in the angular direction from $\beta$ toward $\gamma$, over a segment which is to the total area of the ellipse as $q$ is to $2\pi$. When used as a postfactor, the properties of the dyadic are similar, but the axis of no motion and the planes of rotation are in general different.

Def. Such dyadics we shall call cyclic.

The $N$th power ($N$ being any whole number) of such a dyadic is obtained by multiplying $q$ by $N$. If $q$ is of the form $2\pi N/M$ ($N$ and $M$ being any whole numbers) the $M$th power of the dyadic will be an idemfactor. A cyclic dyadic, therefore, may be regarded as a root of $I$, or at least capable of expression with any required degree of accuracy as a root of $I$.

It should be observed that the value of the above dyadic will not be altered by the substitution for $\alpha$ of any other parallel vector, or for $\beta$ and $\gamma$ of any other conjugate semi-diameters (which succeed one another in the same angular direction) of the same or any similar and similarly situated ellipse, with the changes which these substitutions require in the values of $\alpha'$, $\beta'$, $\gamma'$.
Or, to consider the same changes from another point of view, the value of the dyadic will not be altered by the substitution for $\alpha'$ of any other parallel vector or for $\beta'$ and $\gamma'$ of any other conjugate semi-diameters (which succeed one another in the same angular direction) of the same or any similar and similarly situated ellipse, with the changes which these substitutions require in the values of $\alpha$, $\beta$, and $\gamma$, defined as reciprocals of $\alpha'$, $\beta'$, $\gamma'$.

148. The strain represented by the equation
$$\rho' = \{a\,ii + b\,jj + c\,kk\}\cdot\rho$$
where $a$, $b$, $c$ are positive scalars, may be described as consisting of three elongations (or contractions) parallel to the axes $i$, $j$, $k$, which are called the principal axes of the strain, and which have the property that their directions are not affected by the strain. The scalars $a$, $b$, $c$ are called the principal ratios of elongation. (When one of these is less than unity, it represents a contraction.) The order of the three elongations is immaterial, since the original dyadic is equal to the product of the three dyadics
$$a\,ii + jj + kk, \qquad ii + b\,jj + kk, \qquad ii + jj + c\,kk$$
taken in any order.

Def. A dyadic which is reducible to this form we shall call a right tensor. The displacement represented by a right tensor is called a pure strain. A right tensor is evidently self-conjugate.

149. We have seen (No. 135) that every dyadic may be expressed in the form
$$\pm\{a\,i'i + b\,j'j + c\,k'k\},$$
where $a$, $b$, $c$ are positive scalars. This is equivalent to
$$\pm\{a\,i'i' + b\,j'j' + c\,k'k'\}\cdot\{i'i + j'j + k'k\}$$
and to
$$\pm\{i'i + j'j + k'k\}\cdot\{a\,ii + b\,jj + c\,kk\}.$$
Hence every dyadic may be expressed as the product of a versor and a right tensor with the scalar factor $\pm 1$. The versor may precede or follow. It will be the same versor in either case, and the ratios of elongation will be the same; but the position of the principal axes of the tensor will differ in the two cases, either system being derived from the other by multiplication by the versor.
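The statement of No. 149, that the versor may precede or follow the right tensor, can be illustrated numerically. The sketch below is offered only under stated assumptions: dyadics are represented by 3×3 arrays of coefficients, the versor is taken about the axis k, and the names `versor_about_k`, `right_tensor`, `PHI` and the rest are hypothetical.

```python
import math

def matmul(p, q):
    """Direct product of two dyadics given as 3x3 coefficient arrays."""
    return [[sum(p[r][k] * q[k][s] for k in range(3)) for s in range(3)]
            for r in range(3)]

def conjugate(p):
    """Conjugate dyadic (the transposed coefficient array)."""
    return [[p[s][r] for s in range(3)] for r in range(3)]

def versor_about_k(q):
    """The versor i'i + j'j + kk, with i', j' turned through q about k."""
    c, s = math.cos(q), math.sin(q)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def right_tensor(a, b, c):
    """The right tensor a ii + b jj + c kk with principal axes i, j, k."""
    return [[a, 0.0, 0.0], [0.0, b, 0.0], [0.0, 0.0, c]]

V = versor_about_k(0.7)
T = right_tensor(2.0, 0.5, 3.0)

# Versor following the pure strain (as prefactors, T acts first).
PHI = matmul(V, T)

# The same ratios of elongation referred to the turned axes i', j', k.
T_TURNED = matmul(matmul(V, T), conjugate(V))

# Versor preceding the pure strain: the same dyadic is recovered.
PHI_AGAIN = matmul(T_TURNED, V)
```

Here `PHI` and `PHI_AGAIN` agree to within rounding, while the principal axes of `T_TURNED` are those of `T` multiplied by the versor, as the text asserts.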
Def. The displacement represented by the equation
$$\rho' = -\rho$$
is called inversion. The most general case of a homogeneous strain may therefore be produced by a pure strain and a rotation with or without inversion.

150. If
$$\Phi = a\,i'i + b\,j'j + c\,k'k,$$
$$\Phi\cdot\Phi_C = a^2\,i'i' + b^2\,j'j' + c^2\,k'k',$$
and
$$\Phi_C\cdot\Phi = a^2\,ii + b^2\,jj + c^2\,kk.$$
The general problem of the determination of the principal ratios and axes of strain for a given dyadic may thus be reduced to the case of a right tensor.

151. Def. The effect of a prefactor of the form
$$a\,\alpha\alpha' + b\,\beta\beta' + c\,\gamma\gamma',$$
where $a$, $b$, $c$ are positive or negative scalars, $\alpha$, $\beta$, $\gamma$ non-complanar vectors, and $\alpha'$, $\beta'$, $\gamma'$ their reciprocals, is to change $\alpha$ into $a\alpha$, $\beta$ into $b\beta$, and $\gamma$ into $c\gamma$. As a postfactor, the same dyadic will change $\alpha'$ into […]
$$\alpha\alpha' + p\,\beta\beta' + p\,\gamma\gamma', \qquad \alpha\alpha' + \cos q\,\{\beta\beta' + \gamma\gamma'\} + \sin q\,\{\gamma\beta' - \beta\gamma'\},$$
of which the order is immaterial, and if we suppose the vector on which we operate to be resolved into two components, one parallel to $\alpha$, and the other in the $\beta$-$\gamma$ plane. The effect of the first factor is to multiply by $a$ the component parallel to $\alpha$, without affecting the other. The effect of the second is to multiply by $p$ the component in the $\beta$-$\gamma$ plane without affecting the other. The effect of the third is to give the component in the $\beta$-$\gamma$ plane the kind of elliptic rotation described in No. 147.

The effect of the same dyadic as a postfactor is of the same nature.

The value of the dyadic is not affected by the substitution for $\alpha$ of another vector having the same direction, nor by the substitution for $\beta$ and $\gamma$ of two other conjugate semi-diameters of the same or a similar and similarly situated ellipse, and which follow one another in the same angular direction.

Def. Such dyadics we shall call cyclotonic.

154. Cyclotonics which are reducible to the same form except with respect to the values of $a$, $p$, and $q$ are homologous. They are multiplied by multiplying the values of $a$, and also those of $p$, and adding those of $q$.
Thus, the product of
$$a_1\,\alpha\alpha' + p_1\cos q_1\,\{\beta\beta' + \gamma\gamma'\} + p_1\sin q_1\,\{\gamma\beta' - \beta\gamma'\}$$
and
$$a_2\,\alpha\alpha' + p_2\cos q_2\,\{\beta\beta' + \gamma\gamma'\} + p_2\sin q_2\,\{\gamma\beta' - \beta\gamma'\}$$
is
$$a_1a_2\,\alpha\alpha' + p_1p_2\cos(q_1 + q_2)\,\{\beta\beta' + \gamma\gamma'\} + p_1p_2\sin(q_1 + q_2)\,\{\gamma\beta' - \beta\gamma'\}.$$
A dyadic of this form, in which the value of $q$ is not zero, or the product of $\pi$ and a positive or negative integer, is homologous only with such dyadics as are obtained by varying the values of $a$, $p$, and $q$.

155. In general, any dyadic may be reduced to the form either of a tonic or of a cyclotonic. (The exceptions are such as are made by the limiting cases.) We may show this, and also indicate how the reduction may be made, as follows. Let $\Phi$ be any dyadic. We have first to show that there is at least one direction of $\rho$ for which
$$\Phi\cdot\rho = a\rho.$$
This equation is equivalent to
$$\Phi\cdot\rho - a\rho = 0, \quad\text{or,}\quad \{\Phi - aI\}\cdot\rho = 0.$$
That is, $\Phi - aI$ is a planar dyadic, which may be expressed by the equation
$$|\Phi - aI| = 0.$$
(See No. 140.) Let
$$\Phi = \lambda i + \mu j + \nu k;$$
the equation becomes
$$\big|[\lambda - ai]\,i + [\mu - aj]\,j + [\nu - ak]\,k\big| = 0,$$
or,
$$[\lambda - ai]\times[\mu - aj]\cdot[\nu - ak] = 0,$$
or,
$$a^3 - (i\cdot\lambda + j\cdot\mu + k\cdot\nu)\,a^2 + (i\cdot\mu\times\nu + j\cdot\nu\times\lambda + k\cdot\lambda\times\mu)\,a - \lambda\times\mu\cdot\nu = 0.$$
This may be written
$$a^3 - \Phi_S\,a^2 + \{\Phi^{-1}\}_S\,|\Phi|\,a - |\Phi| = 0.\,^*$$
Now if the dyadic $\Phi$ is given in any form, the scalars
$$\Phi_S, \qquad \{\Phi^{-1}\}_S, \qquad |\Phi|$$
are easily determined. We have therefore a cubic equation in $a$, for which we can find at least one and perhaps three roots. That is, we can find at least one value of $a$, and perhaps three, which will satisfy the equation
$$|\Phi - aI| = 0.$$
By substitution of such a value, $\Phi - aI$ becomes a planar dyadic, the planes of which may be easily determined.† Let $\alpha$ be a vector normal to the plane of the consequents. Then
$$\{\Phi - aI\}\cdot\alpha = 0, \qquad \Phi\cdot\alpha = a\alpha.$$
If $\Phi$ is a tonic, we may obtain three equations of this kind, say
$$\Phi\cdot\alpha = a\alpha, \qquad \Phi\cdot\beta = b\beta, \qquad \Phi\cdot\gamma = c\gamma,$$
in which $\alpha$, $\beta$, $\gamma$ are not complanar. Hence (by No. 108),
$$\Phi = a\,\alpha\alpha' + b\,\beta\beta' + c\,\gamma\gamma',$$
where $\alpha'$, $\beta'$, $\gamma'$ are the reciprocals of $\alpha$, $\beta$, $\gamma$.

In any case, we may suppose $a$ to have the same sign as $|\Phi|$, since the cubic equation must have such a root.
Let $\alpha$ (as before) be normal to the plane of the consequents of the planar $\Phi - aI$, and $\alpha'$ normal to the plane of the antecedents, the lengths of $\alpha$ and $\alpha'$ being such that $\alpha\cdot\alpha' = 1$.‡ Let $\beta$ be any vector normal to $\alpha'$, and such that $\Phi\cdot\beta$ is not parallel to $\beta$. (The case in which $\Phi\cdot\beta$ is always parallel to $\beta$, if $\beta$ is perpendicular to $\alpha'$, is evidently that of a tonic, and needs no farther discussion.) $\{\Phi - aI\}\cdot\beta$ and therefore $\Phi\cdot\beta$ will be perpendicular to $\alpha'$. The same will be true of $\Phi^2\cdot\beta$. Now (by No. 140)
$$[\Phi\cdot\alpha]\cdot[\Phi^2\cdot\beta]\times[\Phi\cdot\beta] = |\Phi|\;\alpha\cdot[\Phi\cdot\beta]\times\beta,$$
that is,
$$a\,\alpha\cdot[\Phi^2\cdot\beta]\times[\Phi\cdot\beta] = |\Phi|\;\alpha\cdot[\Phi\cdot\beta]\times\beta.$$

* [See note on p. 90.]
† In particular cases, $\Phi - aI$ may reduce to a linear dyadic, or to zero. These, however, will present no difficulties to the student.
‡ For the case in which the two planes are perpendicular to each other, see No. 157.

Hence, since $[\Phi^2\cdot\beta]\times[\Phi\cdot\beta]$ and $[\Phi\cdot\beta]\times\beta$ are parallel,
$$a\,[\Phi^2\cdot\beta]\times[\Phi\cdot\beta] = |\Phi|\,[\Phi\cdot\beta]\times\beta.$$
Since $a^{-1}|\Phi|$ is positive, we may set
$$p^2 = a^{-1}|\Phi|.$$
If we also set
$$\beta_1 = p^{-1}\Phi\cdot\beta, \quad \beta_2 = p^{-2}\Phi^2\cdot\beta, \quad\text{etc.,}$$
$$\beta_{-1} = p\,\Phi^{-1}\cdot\beta, \quad \beta_{-2} = p^2\Phi^{-2}\cdot\beta, \quad\text{etc.,}$$
the vectors $\beta$, $\beta_1$, $\beta_2$, etc., $\beta_{-1}$, $\beta_{-2}$, etc., will all lie in the plane perpendicular to $\alpha'$, and we shall have
$$\beta_2\times\beta_1 = \beta_1\times\beta, \qquad [\beta_2 + \beta]\times\beta_1 = 0.$$
We may therefore set
$$\beta_2 + \beta = 2n\beta_1.$$
Multiplying by $p^{-1}\Phi$, and by $p\,\Phi^{-1}$,
$$\beta_3 + \beta_1 = 2n\beta_2, \quad \beta_4 + \beta_2 = 2n\beta_3, \quad\text{etc.,}$$
$$\beta_1 + \beta_{-1} = 2n\beta, \quad \beta + \beta_{-2} = 2n\beta_{-1}, \quad\text{etc.}$$
Now, if $n > 1$, and we lay off from a common origin the vectors
$$\beta, \ \beta_1, \ \beta_2, \ \text{etc.,} \quad \beta_{-1}, \ \beta_{-2}, \ \text{etc.,}$$
the broken line joining the termini of these vectors will be convex toward the origin. All these vectors must therefore lie between two limiting lines, which may be drawn from the origin, and which may be described as having the directions of $\beta_\infty$ and $\beta_{-\infty}$.* A vector having either of these directions is unaffected in direction by multiplication by $\Phi$. In this case, therefore, $\Phi$ is a tonic.
If $n < -1$ we may obtain the same result by considering the vectors
$$\beta, \ -\beta_1, \ \beta_2, \ -\beta_3, \ \beta_4, \ \text{etc.,} \quad -\beta_{-1}, \ \beta_{-2}, \ \text{etc.,}$$
except that a vector in the limiting directions will be reversed in direction by multiplication by $\Phi$, which implies that the two corresponding coefficients of the tonic are negative.

If $1 > n > -1$,† we may set
$$n = \cos q.$$
Then
$$\beta_{-1} + \beta_1 = 2\cos q\,\beta.$$
Let us now determine $\gamma$ by the equation
$$\beta_1 = \cos q\,\beta + \sin q\,\gamma.$$
This gives
$$\beta_{-1} = \cos q\,\beta - \sin q\,\gamma.$$
Now $\alpha'$ is one of the reciprocals of $\alpha$, $\beta$, and $\gamma$. Let $\beta'$ and $\gamma'$ be the others. If we set
$$\Psi = \cos q\,\{\beta\beta' + \gamma\gamma'\} + \sin q\,\{\gamma\beta' - \beta\gamma'\},$$
we have
$$\Psi\cdot\alpha = 0, \qquad \Psi\cdot\beta = \beta_1, \qquad \Psi\cdot\beta_{-1} = \beta.$$

* The termini of the vectors will in fact lie on a hyperbola.
† For the limiting cases, in which $n = 1$, or $n = -1$, see No. 156.

Therefore, since
$$\{a\,\alpha\alpha' + p\Psi\}\cdot\alpha = a\alpha = \Phi\cdot\alpha,$$
$$\{a\,\alpha\alpha' + p\Psi\}\cdot\beta = p\beta_1 = \Phi\cdot\beta,$$
$$\{a\,\alpha\alpha' + p\Psi\}\cdot\beta_{-1} = p\beta = \Phi\cdot\beta_{-1},$$
it follows (by No. 108) that
$$\Phi = a\,\alpha\alpha' + p\Psi = a\,\alpha\alpha' + p\cos q\,\{\beta\beta' + \gamma\gamma'\} + p\sin q\,\{\gamma\beta' - \beta\gamma'\}.$$

156. It will be sufficient to indicate (without demonstration) the forms of dyadics which belong to the particular cases which have been passed over in the preceding paragraph, so far as they present any notable peculiarities.

If $n = \pm 1$ (page 72), the dyadic may be reduced to the form
$$a\,\alpha\alpha' + b\,\{\beta\beta' + \gamma\gamma'\} + bc\,\beta\gamma',$$
where $\alpha$, $\beta$, $\gamma$ are three non-complanar vectors, $\alpha'$, $\beta'$, $\gamma'$ their reciprocals, and $a$, $b$, $c$ positive or negative scalars. The effect of this as an operator will be evident if we resolve it into the three homologous factors
$$a\,\alpha\alpha' + \beta\beta' + \gamma\gamma', \qquad \alpha\alpha' + b\,\{\beta\beta' + \gamma\gamma'\}, \qquad \alpha\alpha' + \beta\beta' + \gamma\gamma' + c\,\beta\gamma'.$$
The displacement due to the last factor may be called a simple shear. It consists (when the dyadic is used as prefactor) of a motion parallel to $\beta$, and proportioned to the distance from the $\alpha$-$\beta$ plane. This factor may be called a shearer.

This dyadic is homologous with such as are obtained by varying the values of $a$, $b$, $c$, and only with such, when the values of $a$ and $b$ are different, and that of $c$ other than zero.

157.
If the planar $\Phi - aI$ (page 71) has perpendicular planes, there may be another value of $a$, of the same sign as $|\Phi|$, which will give a planar which has not perpendicular planes. When this is not the case, the dyadic may always be reduced to the form
$$a\,\{\alpha\alpha' + \beta\beta' + \gamma\gamma'\} + ab\,\{\alpha\beta' + \beta\gamma'\} + ac\,\alpha\gamma',$$
where $\alpha$, $\beta$, $\gamma$ are three non-complanar vectors, $\alpha'$, $\beta'$, $\gamma'$ their reciprocals, and $a$, $b$, $c$ positive or negative scalars. This may be resolved into the homologous factors
$$aI \quad\text{and}\quad I + b\,\{\alpha\beta' + \beta\gamma'\} + c\,\alpha\gamma'.$$
The displacement due to the last factor may be called a complex shear. It consists (when the dyadic is used as prefactor) of a motion parallel to $\alpha$ which is proportional to the distance from the $\alpha$-$\gamma$ plane, together with a motion parallel to $b\beta + c\alpha$ which is proportional to the distance from the $\alpha$-$\beta$ plane. This factor may be called a complex shearer.

This dyadic is homologous with such as are obtained by varying the values of $a$, $b$, $c$, and only such, unless $b = 0$.

It is always possible to take three mutually perpendicular vectors for $\alpha$, $\beta$, and $\gamma$; or, if it be preferred, to take such values for these vectors as shall make the term containing $c$ vanish.

158. The dyadics described in the two last paragraphs may be called shearing dyadics.

The criterion of a shearer is
$$\{\Phi - I\}^3 = 0, \qquad \Phi - I \neq 0.$$
The criterion of a simple shearer is
$$\{\Phi - I\}^2 = 0, \qquad \Phi - I \neq 0.$$
The criterion of a complex shearer is
$$\{\Phi - I\}^3 = 0, \qquad \{\Phi - I\}^2 \neq 0.$$

Note. If a dyadic $\Phi$ is a linear function of a vector $\rho$ (the term linear being used in the same sense as in No. 105), we may represent the relation by an equation of the form
$$\Phi = \alpha\beta\;\gamma\cdot\rho + \epsilon\zeta\;\eta\cdot\rho + \text{etc.,}$$
or
$$\Phi = \{\alpha\beta\gamma + \epsilon\zeta\eta + \text{etc.}\}\cdot\rho,$$
where the expression in the braces may be called a triadic polynomial, and a single term $\alpha\beta\gamma$ a triad, or the indeterminate product of the three vectors $\alpha$, $\beta$, $\gamma$.
We are thus led successively to the consideration of higher orders of indeterminate products of vectors, triads, tetrads, etc., in general polyads, and of polynomials consisting of such terms, triadics, tetradics, etc., in general polyadics. But the development of the subject in this direction lies beyond our present purpose.

It may sometimes be convenient to use notations like
$$\frac{\lambda,\ \mu,\ \nu}{|\,\alpha,\ \beta,\ \gamma} \quad\text{and}\quad \frac{\lambda,\ \mu,\ \nu}{\alpha,\ \beta,\ \gamma\,|}$$
to represent the conjugate dyadics which, the first as prefactor, and the second as postfactor, change $\alpha$, $\beta$, $\gamma$ into $\lambda$, $\mu$, $\nu$ respectively. In the notations of the preceding chapter these would be written
$$\lambda\alpha' + \mu\beta' + \nu\gamma' \quad\text{and}\quad \alpha'\lambda + \beta'\mu + \gamma'\nu$$
respectively, $\alpha'$, $\beta'$, $\gamma'$ denoting the reciprocals of $\alpha$, $\beta$, $\gamma$. If $\tau$ is a linear function of $\rho$, the dyadics which as prefactor and postfactor change $\rho$ into $\tau$ may be written respectively
$$\frac{\tau}{|\,\rho} \quad\text{and}\quad \frac{\tau}{\rho\,|}.$$
If $\tau$ is any function of $\rho$, the dyadics which as prefactor and postfactor change $d\rho$ into $d\tau$ may be written respectively
$$\frac{d\tau}{|\,d\rho} \quad\text{and}\quad \frac{d\tau}{d\rho\,|}.$$
In the notation of the following chapter the second of these (when $\rho$ denotes a position-vector) would be written $\nabla\tau$. The triadic which as prefactor changes $d\rho$ into $\dfrac{d\tau}{|\,d\rho}$ may be written $\dfrac{d^2\tau}{|\,d\rho^2}$, and that which as postfactor changes $d\rho$ into $\dfrac{d\tau}{d\rho\,|}$ may be written $\dfrac{d^2\tau}{d\rho^2\,|}$. The latter would be written $\nabla\nabla\tau$ in the notations of the following chapter.

CHAPTER IV.

(Supplementary to Chapter II.)

CONCERNING THE DIFFERENTIAL AND INTEGRAL CALCULUS OF VECTORS.

159. If $\omega$ is a vector having continuously varying values in space, and $\rho$ the vector determining the position of a point, we may set
$$\rho = xi + yj + zk, \qquad d\rho = dx\,i + dy\,j + dz\,k,$$
and regard $\omega$ as a function of $\rho$, or of $x$, $y$, and $z$. Then,
$$d\omega = dx\,\frac{d\omega}{dx} + dy\,\frac{d\omega}{dy} + dz\,\frac{d\omega}{dz},$$
that is,
$$d\omega = d\rho\cdot\left\{i\,\frac{d\omega}{dx} + j\,\frac{d\omega}{dy} + k\,\frac{d\omega}{dz}\right\}.$$
If we set
$$\nabla\omega = i\,\frac{d\omega}{dx} + j\,\frac{d\omega}{dy} + k\,\frac{d\omega}{dz},$$
$$d\omega = d\rho\cdot\nabla\omega.$$
Here $\nabla$ stands for
$$i\,\frac{d}{dx} + j\,\frac{d}{dy} + k\,\frac{d}{dz},$$
exactly as in No.
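52, except that it is here applied to a vector and produces a dyadic, while in the former case it was applied to a scalar and produced a vector. The dyadic $\nabla\omega$ represents the nine differential coefficients of the three components of $\omega$ with respect to $x$, $y$, and $z$, just as the vector $\nabla u$ (where $u$ is a scalar function of $\rho$) represents the three differential coefficients of the scalar $u$ with respect to $x$, $y$, and $z$.

It is evident that the expressions $\nabla\cdot\omega$ and $\nabla\times\omega$ already defined (No. 54) are equivalent to $\{\nabla\omega\}_S$ and $\{\nabla\omega\}_\times$.

160. An important case is that in which the vector operated on is of the form $\nabla u$. We have then
$$d\nabla u = d\rho\cdot\nabla\nabla u,$$
where
$$\begin{aligned} \nabla\nabla u = {} & \frac{d^2u}{dx^2}\,ii + \frac{d^2u}{dx\,dy}\,ij + \frac{d^2u}{dx\,dz}\,ik \\ & + \frac{d^2u}{dy\,dx}\,ji + \frac{d^2u}{dy^2}\,jj + \frac{d^2u}{dy\,dz}\,jk \\ & + \frac{d^2u}{dz\,dx}\,ki + \frac{d^2u}{dz\,dy}\,kj + \frac{d^2u}{dz^2}\,kk. \end{aligned}$$
This dyadic, which is evidently self-conjugate, represents the six differential coefficients of the second order of $u$ with respect to $x$, $y$, and $z$.*

161. The operators $\nabla\times$ and $\nabla\cdot$ may be applied to dyadics in a manner entirely analogous to their use with scalars. Thus we may define $\nabla\times\Phi$ and $\nabla\cdot\Phi$ by the equations
$$\nabla\times\Phi = i\times\frac{d\Phi}{dx} + j\times\frac{d\Phi}{dy} + k\times\frac{d\Phi}{dz}, \qquad \nabla\cdot\Phi = i\cdot\frac{d\Phi}{dx} + j\cdot\frac{d\Phi}{dy} + k\cdot\frac{d\Phi}{dz}.$$
Then, if
$$\Phi = \alpha i + \beta j + \gamma k,$$
$$\nabla\times\Phi = \nabla\times\alpha\; i + \nabla\times\beta\; j + \nabla\times\gamma\; k,$$
$$\nabla\cdot\Phi = \nabla\cdot\alpha\; i + \nabla\cdot\beta\; j + \nabla\cdot\gamma\; k.$$
Or, if
$$\Phi = i\alpha + j\beta + k\gamma,$$
$$\nabla\cdot\Phi = \frac{d\alpha}{dx} + \frac{d\beta}{dy} + \frac{d\gamma}{dz}.$$

162. We may now regard $\nabla\cdot\nabla$ in expressions like $\nabla\cdot\nabla\omega$ as representing two successive operations, the result of which will be
$$\frac{d^2\omega}{dx^2} + \frac{d^2\omega}{dy^2} + \frac{d^2\omega}{dz^2},$$
in accordance with the definition of No. 70. We may also write $\nabla\cdot\nabla\Phi$ for […], etc.

† See footnote to No. 160.

164. The following equations between surface-integrals for a closed surface and volume-integrals for the space enclosed seem worthy of mention. One or two have already been given, and are here repeated for the sake of comparison.
$$\iint d\sigma\; u = \iiint dv\; \nabla u, \qquad (1)$$
$$\iint d\sigma\; \omega = \iiint dv\; \nabla\omega, \qquad (2)$$
$$\iint d\sigma\cdot\omega = \iiint dv\; \nabla\cdot\omega, \qquad (3)$$
[…]
$$[\ldots] - \nabla\omega\times\tau, \qquad (2)$$
$$\nabla\times[\tau\times\omega] = \omega\cdot\nabla\tau - \nabla\cdot\tau\;\omega - \tau\cdot\nabla\omega + \nabla\cdot\omega\;\tau, \qquad (3)$$
$$\nabla(\tau\cdot\omega) = \nabla\tau\cdot\omega + \nabla\omega\cdot\tau, \qquad (4)$$
$$\nabla\cdot\{\tau\omega\} = \nabla\cdot\tau\;\omega + \tau\cdot\nabla\omega, \qquad (5)$$
$$\nabla\times\{\tau\omega\} = \nabla\times\tau\;\omega - \tau\times\nabla\omega, \qquad (6)$$
$$\nabla\cdot\{u\Phi\} = \nabla u\cdot\Phi + u\,\nabla\cdot\Phi, \qquad (7)$$
etc.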
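Equation (3) of the preceding list may be tested numerically. The following minimal sketch assumes two arbitrarily chosen polynomial fields `tau` and `omega`, forms the dyadic ∇ω by central differences with the derivative index first (as in No. 159), and compares the two members at a point. All names and the step `H` are hypothetical choices for illustration, not part of the text.

```python
H = 1e-5  # step of the central differences (an illustrative choice)

def tau(p):
    x, y, z = p
    return (y * z, x * x, x + z)

def omega(p):
    x, y, z = p
    return (x * y, z, y * y - x)

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def nabla(f, p):
    """Dyadic 'nabla f': entry [a][b] approximates d f_b / d x_a."""
    rows = []
    for axis in range(3):
        hi, lo = list(p), list(p)
        hi[axis] += H
        lo[axis] -= H
        rows.append([(u - v) / (2 * H) for u, v in zip(f(hi), f(lo))])
    return rows

def scalar_of(g):
    """Scalar of the dyadic, i.e. the divergence when g = nabla f."""
    return g[0][0] + g[1][1] + g[2][2]

def vector_of(g):
    """Vector of the dyadic, i.e. the curl when g = nabla f."""
    return (g[1][2] - g[2][1], g[2][0] - g[0][2], g[0][1] - g[1][0])

def vec_dot_dyadic(v, g):
    """Direct product v . g with the vector as prefactor."""
    return tuple(sum(v[a] * g[a][b] for a in range(3)) for b in range(3))

def both_members(p):
    """Left and right members of equation (3) evaluated at the point p."""
    t, w = tau(p), omega(p)
    gt, gw = nabla(tau, p), nabla(omega, p)
    left = vector_of(nabla(lambda s: cross(tau(s), omega(s)), p))
    first = vec_dot_dyadic(w, gt)    # omega . nabla tau
    third = vec_dot_dyadic(t, gw)    # tau . nabla omega
    right = tuple(first[n] - scalar_of(gt) * w[n] - third[n]
                  + scalar_of(gw) * t[n] for n in range(3))
    return left, right
```

Calling `both_members((1.0, 2.0, 3.0))` should return two triples that agree to within the error of the differences. The other equations of the list could be checked in the same manner.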
The principle in all these cases is that if we have one of the operators $\nabla$, $\nabla\cdot$, $\nabla\times$ prefixed to a product of any kind, and we make any transformation of the expression which would be allowable if the $\nabla$ were a vector (viz., by changes in the order of the factors, in the signs of multiplication, in the parentheses written or implied, etc.), by which changes the $\nabla$ is brought into connection with one particular factor, the expression thus transformed will represent the part of the value of the original expression which results from the variation of that factor.

167. From the relations indicated in the last four paragraphs, may be obtained directly a great number of transformations of definite integrals similar to those given in Nos. 74-77, and corresponding to those known in the scalar calculus by the name of integration by parts.

168. The student will now find no difficulty in generalizing the integrations of differential equations given in Nos. 78-89 by applying to vectors those which relate to scalars, and to dyadics those which relate to vectors.

169. The propositions in No. 90 relating to minimum values of the volume-integral $\iiint u\,\omega\cdot\omega\,dv$ may be generalized by substituting $\omega\cdot\Phi\cdot\omega$ for $u\,\omega\cdot\omega$, $\Phi$ being a given dyadic function of position in space.

170. The theory of the integrals which have been called potentials, Newtonians, etc. (see Nos. 91-102) may be extended to cases in which the operand is a vector instead of a scalar or a dyadic instead of a vector. So far as the demonstrations are concerned, the case of a vector may be reduced to that of a scalar by considering separately its three components, and the case of a dyadic may be reduced to that of a vector, by supposing the dyadic expressed in the form $\phi i + \chi j + […]$ and considering each of these terms separately.

CHAPTER V.

CONCERNING TRANSCENDENTAL FUNCTIONS OF DYADICS.

171.
Def. The exponential function, the sine and the cosine of a dyadic may be defined by infinite series, exactly as the corresponding functions in scalar analysis, viz.,
$$e^\Phi = I + \Phi + \frac{1}{2}\Phi^2 + \frac{1}{2\cdot 3}\Phi^3 + \text{etc.,}$$
$$\sin\Phi = \Phi - \frac{1}{2\cdot 3}\Phi^3 + \frac{1}{2\cdot 3\cdot 4\cdot 5}\Phi^5 - \text{etc.,}$$
$$\cos\Phi = I - \frac{1}{2}\Phi^2 + \frac{1}{2\cdot 3\cdot 4}\Phi^4 - \text{etc.}$$
These series are always convergent. For every value of $\Phi$ there is one and only one value of each of these functions. The exponential function may also be defined as the limit of the expression
$$\left\{I + \frac{\Phi}{N}\right\}^N$$
when $N$, which is a whole number, is increased indefinitely. That this definition is equivalent to the preceding, will appear if the expression is expanded by the binomial theorem, which is evidently applicable in a case of this kind.

These functions of $\Phi$ are homologous with $\Phi$.

172. We may define the logarithm as the function which is the inverse of the exponential, so that the equations […]
$$e^{-\Xi\cdot\Phi} = \cos\Phi - \Xi\cdot\sin\Phi,$$
whence
$$\cos\Phi = \tfrac{1}{2}\{e^{\Xi\cdot\Phi} + e^{-\Xi\cdot\Phi}\}, \qquad \sin\Phi = -\tfrac{1}{2}\,\Xi\cdot\{e^{\Xi\cdot\Phi} - e^{-\Xi\cdot\Phi}\}.$$

175. If
$$\Phi\cdot\Psi = \Psi\cdot\Phi = 0,$$
$$\{\Phi + \Psi\}^2 = \Phi^2 + \Psi^2, \qquad \{\Phi + \Psi\}^3 = \Phi^3 + \Psi^3, \quad\text{etc.}$$
$$e^{\Phi + \Psi} = e^\Phi + e^\Psi - I,$$
$$\cos\{\Phi + \Psi\} = \cos\Phi + \cos\Psi - I,$$
$$\sin\{\Phi + \Psi\} = \sin\Phi + \sin\Psi.$$

176.
$$|e^\Phi| = e^{\Phi_S}.$$
For the first member of this equation is the limit of
$$\big|\{I + N^{-1}\Phi\}^N\big|, \quad\text{that is, of}\quad |I + N^{-1}\Phi|^N.$$
If we set $\Phi = \alpha i + \beta j + \gamma k$, the limit becomes that of
$$(1 + N^{-1}\alpha\cdot i + N^{-1}\beta\cdot j + N^{-1}\gamma\cdot k)^N, \quad\text{or}\quad (1 + N^{-1}\Phi_S)^N,$$
the limit of which is the second member of the equation to be proved.

177. By the definition of exponentials, the expression
$$e^{q\{kj - jk\}}$$
represents the limit of
$$\{I + qN^{-1}\{kj - jk\}\}^N.$$
Now $I + qN^{-1}\{kj - jk\}$ evidently represents a versor having the axis $i$ and the infinitesimal angle of version $qN^{-1}$. Hence the above exponential represents a versor having the same axis and the angle of version $q$. If we set $qi = \omega$, […] and the angle of rotation is equal to the magnitude of $\omega$. The value of the versor will not be affected by increasing or diminishing the magnitude of $\omega$ by $2\pi$.

178. If, as in No. 151,
$$\Phi = a\,\alpha\alpha' + b\,\beta\beta' + c\,\gamma\gamma',$$
the definitions of No.
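171 give
$$e^\Phi = e^a\,\alpha\alpha' + e^b\,\beta\beta' + e^c\,\gamma\gamma',$$
$$\cos\Phi = \cos a\;\alpha\alpha' + \cos b\;\beta\beta' + \cos c\;\gamma\gamma',$$
$$\sin\Phi = \sin a\;\alpha\alpha' + \sin b\;\beta\beta' + \sin c\;\gamma\gamma'.$$
If $a$, $b$, $c$ are positive and unequal, we may add, by No. 172,
$$\log\Phi = \log a\;\alpha\alpha' + \log b\;\beta\beta' + \log c\;\gamma\gamma'.$$

179. If, as in No. 153,
$$\Phi = a\,\alpha\alpha' + b\,\{\beta\beta' + \gamma\gamma'\} + c\,\{\gamma\beta' - \beta\gamma'\} = a\,\alpha\alpha' + p\cos q\,\{\beta\beta' + \gamma\gamma'\} + p\sin q\,\{\gamma\beta' - \beta\gamma'\},$$
we have by No. 173
$$e^\Phi = e^{a\alpha\alpha'}\cdot e^{b\{\beta\beta' + \gamma\gamma'\}}\cdot e^{c\{\gamma\beta' - \beta\gamma'\}}.$$
But
$$e^{a\alpha\alpha'} = e^a\,\alpha\alpha' + \beta\beta' + \gamma\gamma',$$
$$e^{b\{\beta\beta' + \gamma\gamma'\}} = \alpha\alpha' + e^b\,\{\beta\beta' + \gamma\gamma'\},$$
$$e^{c\{\gamma\beta' - \beta\gamma'\}} = \alpha\alpha' + \cos c\,\{\beta\beta' + \gamma\gamma'\} + \sin c\,\{\gamma\beta' - \beta\gamma'\}.$$
Therefore,
$$e^\Phi = e^a\,\alpha\alpha' + e^b\cos c\,\{\beta\beta' + \gamma\gamma'\} + e^b\sin c\,\{\gamma\beta' - \beta\gamma'\}.$$
Hence, if $a$ is positive,
$$\log\Phi = \log a\;\alpha\alpha' + \log p\,\{\beta\beta' + \gamma\gamma'\} + q\,\{\gamma\beta' - \beta\gamma'\}.$$
Since the value of $\Phi$ is not affected by increasing or diminishing $q$ by $2\pi$, the function $\log\Phi$ is many-valued.

To find the value of $\cos\Phi$ and $\sin\Phi$, let us set
$$\Theta = b\,\{\beta\beta' + \gamma\gamma'\} + c\,\{\gamma\beta' - \beta\gamma'\}, \qquad \Xi = \gamma\beta' - \beta\gamma'.$$
Then, by No. 175,
$$\cos\Phi = \cos\{a\,\alpha\alpha'\} + \cos\Theta - I.$$
$$\cos\{a\,\alpha\alpha'\} - I = \cos a\;\alpha\alpha' - \alpha\alpha'.$$
Therefore,
$$\cos\Phi = \cos a\;\alpha\alpha' - \alpha\alpha' + \cos\Theta.$$
Now, by No. 174,
$$\cos\Theta = \tfrac{1}{2}\{e^{\Xi\cdot\Theta} + e^{-\Xi\cdot\Theta}\}.$$
Since
$$\Xi\cdot\Theta = -c\,\{\beta\beta' + \gamma\gamma'\} + b\,\{\gamma\beta' - \beta\gamma'\},$$
$$e^{\Xi\cdot\Theta} = \alpha\alpha' + e^{-c}\cos b\,\{\beta\beta' + \gamma\gamma'\} + e^{-c}\sin b\,\{\gamma\beta' - \beta\gamma'\},$$
$$e^{-\Xi\cdot\Theta} = \alpha\alpha' + e^{c}\cos b\,\{\beta\beta' + \gamma\gamma'\} - e^{c}\sin b\,\{\gamma\beta' - \beta\gamma'\}.$$
Therefore
$$\cos\Theta = \alpha\alpha' + \tfrac{1}{2}(e^c + e^{-c})\cos b\,\{\beta\beta' + \gamma\gamma'\} - \tfrac{1}{2}(e^c - e^{-c})\sin b\,\{\gamma\beta' - \beta\gamma'\},$$
and
$$\cos\Phi = \cos a\;\alpha\alpha' + \tfrac{1}{2}(e^c + e^{-c})\cos b\,\{\beta\beta' + \gamma\gamma'\} - \tfrac{1}{2}(e^c - e^{-c})\sin b\,\{\gamma\beta' - \beta\gamma'\}.$$
In like manner we find
$$\sin\Phi = \sin a\;\alpha\alpha' + \tfrac{1}{2}(e^c + e^{-c})\sin b\,\{\beta\beta' + \gamma\gamma'\} + \tfrac{1}{2}(e^c - e^{-c})\cos b\,\{\gamma\beta' - \beta\gamma'\}.$$

180. If $\alpha$, $\beta$, $\gamma$ and $\alpha'$, $\beta'$, $\gamma'$ are reciprocals, and
$$\Phi = a\,\alpha\alpha' + b\,\{\beta\beta' + \gamma\gamma'\} + c\,\beta\gamma',$$
and $N$ is any whole number,
$$\Phi^N = a^N\,\alpha\alpha' + b^N\,\{\beta\beta' + \gamma\gamma'\} + N b^{N-1} c\;\beta\gamma'.$$
Therefore,
$$e^\Phi = e^a\,\alpha\alpha' + e^b\,\{\beta\beta' + \gamma\gamma'\} + e^b c\;\beta\gamma',$$
$$\cos\Phi = \cos a\;\alpha\alpha' + \cos b\,\{\beta\beta' + \gamma\gamma'\} - c\sin b\;\beta\gamma',$$
$$\sin\Phi = \sin a\;\alpha\alpha' + \sin b\,\{\beta\beta' + \gamma\gamma'\} + c\cos b\;\beta\gamma'.$$
If $a$ and $b$ are unequal, and $c$ other than zero, we may add
$$\log\Phi = \log a\;\alpha\alpha' + \log b\,\{\beta\beta' + \gamma\gamma'\} + c\,b^{-1}\,\beta\gamma'.$$

181.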
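If $\alpha$, $\beta$, $\gamma$, and $\alpha'$, $\beta'$, $\gamma'$ are reciprocals, and
$$\Phi = aI + b\,\{\alpha\beta' + \beta\gamma'\} + c\,\alpha\gamma',$$
and $N$ is a whole number,
$$\Phi^N = a^N I + N a^{N-1} b\,\{\alpha\beta' + \beta\gamma'\} + \big(N a^{N-1} c + \tfrac{1}{2}N(N-1)\,a^{N-2} b^2\big)\,\alpha\gamma'.$$
Therefore
$$e^\Phi = e^a I + e^a b\,\{\alpha\beta' + \beta\gamma'\} + e^a(\tfrac{1}{2}b^2 + c)\,\alpha\gamma',$$
$$\cos\Phi = \cos a\; I - b\sin a\,\{\alpha\beta' + \beta\gamma'\} - (\tfrac{1}{2}b^2\cos a + c\sin a)\,\alpha\gamma',$$
$$\sin\Phi = \sin a\; I + b\cos a\,\{\alpha\beta' + \beta\gamma'\} - (\tfrac{1}{2}b^2\sin a - c\cos a)\,\alpha\gamma'.$$
Unless $b = 0$, we may add
$$\log\Phi = \log a\; I + b\,a^{-1}\,\{\alpha\beta' + \beta\gamma'\} + (c\,a^{-1} - \tfrac{1}{2}b^2 a^{-2})\,\alpha\gamma'.$$

182. If we suppose any dyadic $\Phi$ to vary, but with the limitation that all its values are homologous, we may obtain from the definitions of No. 171
$$d\{e^\Phi\} = e^\Phi\cdot d\Phi, \qquad (1)$$
$$d\sin\Phi = \cos\Phi\cdot d\Phi, \qquad (2)$$
$$d\cos\Phi = -\sin\Phi\cdot d\Phi, \qquad (3)$$
$$d\log\Phi = \Phi^{-1}\cdot d\Phi, \qquad (4)$$
as in the ordinary calculus, but we must not apply these equations to cases in which the values of $\Phi$ are not homologous.

183. If, however, $\Gamma$ is any constant dyadic, the variations of $t\Gamma$ will necessarily be homologous with $t\Gamma$, and we may write without other limitation than that $\Gamma$ is constant,
$$\frac{d\{e^{t\Gamma}\}}{dt} = \Gamma\cdot e^{t\Gamma}, \qquad (1)$$
$$\frac{d\sin\{t\Gamma\}}{dt} = \Gamma\cdot\cos\{t\Gamma\}, \qquad (2)$$
$$\frac{d\cos\{t\Gamma\}}{dt} = -\Gamma\cdot\sin\{t\Gamma\}, \qquad (3)$$
$$\frac{d\log\{t\Gamma\}}{dt} = \frac{I}{t}. \qquad (4)$$
A second differentiation gives
$$\frac{d^2\{e^{t\Gamma}\}}{dt^2} = \Gamma^2\cdot e^{t\Gamma}, \qquad (5)$$
$$\frac{d^2\sin\{t\Gamma\}}{dt^2} = -\Gamma^2\cdot\sin\{t\Gamma\}, \qquad (6)$$
$$\frac{d^2\cos\{t\Gamma\}}{dt^2} = -\Gamma^2\cdot\cos\{t\Gamma\}. \qquad (7)$$

184. It follows that if we have a differential equation of the form
$$\frac{d\rho}{dt} = \Gamma\cdot\rho,$$
the integral equation will be of the form
$$\rho = e^{t\Gamma}\cdot\rho',$$
$\rho'$ representing the value of $\rho$ for $t = 0$. For this gives
$$\frac{d\rho}{dt} = \Gamma\cdot e^{t\Gamma}\cdot\rho' = \Gamma\cdot\rho,$$
and the proper value of $\rho$ for $t = 0$.

185. Def. A flux which is a linear function of the position-vector is called a homogeneous-strain-flux from the nature of the strain which it produces. Such a flux may evidently be represented by a dyadic.

In the equations of the last paragraph, we may suppose $\rho$ to represent a position-vector, $t$ the time, and $\Gamma$ a homogeneous-strain-flux. Then $e^{t\Gamma}$ will represent the strain produced by the flux $\Gamma$ in the time $t$.

In like manner, if $\Delta$ represents a homogeneous strain, $\{\log\Delta\}/t$ will represent a homogeneous-strain-flux which would produce the strain $\Delta$ in the time $t$.

186.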
If we have
$$\frac{d^2\rho}{dt^2} = \Gamma^2\cdot\rho,$$
where $\Gamma$ is complete, the integral equation will be of the form
$$\rho = e^{t\Gamma}\cdot\alpha + e^{-t\Gamma}\cdot\beta.$$
For this gives
$$\frac{d\rho}{dt} = \Gamma\cdot e^{t\Gamma}\cdot\alpha - \Gamma\cdot e^{-t\Gamma}\cdot\beta,$$
$$\frac{d^2\rho}{dt^2} = \Gamma^2\cdot e^{t\Gamma}\cdot\alpha + \Gamma^2\cdot e^{-t\Gamma}\cdot\beta = \Gamma^2\cdot\rho,$$
and $\alpha$ and $\beta$ may be determined so as to satisfy the equations
$$\rho_{t=0} = \alpha + \beta, \qquad \left[\frac{d\rho}{dt}\right]_{t=0} = \Gamma\cdot\{\alpha - \beta\}.$$

187. The differential equation
$$\frac{d^2\rho}{dt^2} = -\Gamma^2\cdot\rho$$
will be satisfied by
$$\rho = \cos\{t\Gamma\}\cdot\alpha + \sin\{t\Gamma\}\cdot\beta,$$
whence
$$\frac{d\rho}{dt} = -\Gamma\cdot\sin\{t\Gamma\}\cdot\alpha + \Gamma\cdot\cos\{t\Gamma\}\cdot\beta,$$
$$\frac{d^2\rho}{dt^2} = -\Gamma^2\cdot\cos\{t\Gamma\}\cdot\alpha - \Gamma^2\cdot\sin\{t\Gamma\}\cdot\beta = -\Gamma^2\cdot\rho.$$
If $\Gamma$ is complete, the constants $\alpha$ and $\beta$ may be determined to satisfy the equations
$$\rho_{t=0} = \alpha, \qquad \left[\frac{d\rho}{dt}\right]_{t=0} = \Gamma\cdot\beta.$$

188. If
$$\frac{d^2\rho}{dt^2} = \{\Gamma^2 - \Delta^2\}\cdot\rho,$$
where $\Gamma^2 - \Delta^2$ is a complete dyadic, and
$$\Gamma\cdot\Delta = \Delta\cdot\Gamma = 0,$$
we may set
$$\rho = \{\tfrac{1}{2}e^{t\Gamma} + \tfrac{1}{2}e^{-t\Gamma}\ […]$$

189. It will appear, on reference to Nos. 155-157, that every complete dyadic may be expressed in one of three forms, viz., as a square, as a square with the negative sign, or as a difference of squares of two dyadics of which both the direct products are equal to zero. It follows that every equation of the form
$$\frac{d^2\rho}{dt^2} = \Theta\cdot\rho,$$
where $\Theta$ is any constant and complete dyadic, may be integrated by the preceding formulae.

NOTE ON BIVECTOR ANALYSIS.*

1. A vector is determined by three algebraic quantities. It often occurs that the solution of the equations by which these are to be determined gives imaginary values, i.e., instead of scalars we obtain biscalars, or expressions of the form $a + \iota b$, where $a$ and $b$ are scalars, and $\iota = \sqrt{-1}$. It is most simple, and always allowable, to consider the vector as determined by its components parallel to a normal system of axes. In other words, a vector may be represented in the form
$$xi + yj + zk.$$
Now if the vector is required to satisfy certain conditions, the solution of the equations which determine the values of $x$, $y$, and $z$, in the most general case, will give results of the form
$$x = x_1 + \iota x_2, \qquad y = y_1 + \iota y_2, \qquad z = z_1 + \iota z_2,$$

* Thus far, in accordance with the purpose expressed in the footnote on page 17, we have considered only real values of scalars and vectors.
The object of this limitation has been to present the subject in the most elementary manner. The limitation is however often inconvenient, and does not allow the most symmetrical and complete development of the subject in many important directions. Thus in Chapter V, and the latter part of Chapter III, the exclusion of imaginary values has involved a considerable sacrifice of simplicity both in the enunciation of theorems and in their demonstration. The student will find an interesting and profitable exercise in working over this part of the subject with the aid of imaginary values, especially in the discussion of the imaginary roots of the cubic equation on page 71, and in the use of the formula
$$e^{\iota\Phi} = \cos\Phi + \iota\sin\Phi$$
in developing the properties of the sines, cosines, and exponentials of dyadics.

where $x_1$, $x_2$, $y_1$, $y_2$, $z_1$, $z_2$ are scalars. Substituting these values in
$$xi + yj + zk,$$
we obtain
$$(x_1 + \iota x_2)\,i + (y_1 + \iota y_2)\,j + (z_1 + \iota z_2)\,k;$$
or, if we set
$$\rho_1 = x_1 i + y_1 j + z_1 k, \qquad \rho_2 = x_2 i + y_2 j + z_2 k,$$
we obtain
$$\rho_1 + \iota\rho_2.$$
We shall call this a bivector, a term which will include a vector as a particular case. When we wish to express a bivector by a single letter, we shall use the small German letters. Thus we may write
$$\mathfrak{r} = \rho_1 + \iota\rho_2.$$
An important case is that in which $\rho_1$ and $\rho_2$ have the same direction. The bivector may then be expressed in the form $(a + \iota b)\rho$, in which the vector factor, if we choose, may be a unit vector. In this case, we may say that the bivector has a real direction. In fact, if we express the bivector in the form
$$(x_1 + \iota x_2)\,i + (y_1 + \iota y_2)\,j + (z_1 + \iota z_2)\,k,$$
the ratios of the coefficients of $i$, $j$, and $k$, which determine the direction cosines of the vector, will in this case be real.

2. The consideration that operations upon bivectors may be regarded as operations upon their biscalar $x$-, $y$- and $z$-components is sufficient to show the possibility of a bivector analysis and to indicate what its rules must be.
But this point of view does not afford the most simple conception of the operations which we have to perform upon bivectors. It is desirable that the definitions of the fundamental operations should be independent of such extraneous considerations as any system of axes.

The various signs of our analysis, when applied to bivectors, may therefore be defined as follows, viz.,

The bivector equation
$$\mu' + \iota\nu' = \mu'' + \iota\nu''$$
implies the two vector equations
$$\mu' = \mu'', \quad\text{and}\quad \nu' = \nu''.$$
$$[\mu' + \iota\nu'] + [\mu'' + \iota\nu''] = [\mu' + \mu''] + \iota\,[\nu' + \nu''].\,^*$$
$$[\mu' + \iota\nu']\cdot[\mu'' + \iota\nu''] = [\mu'\cdot\mu'' - \nu'\cdot\nu''] + \iota\,[\mu'\cdot\nu'' + \nu'\cdot\mu''].$$
$$[\mu' + \iota\nu']\times[\mu'' + \iota\nu''] = [\mu'\times\mu'' - \nu'\times\nu''] + \iota\,[\mu'\times\nu'' + \nu'\times\mu''].$$

* $(a + \iota b)[\mu + \iota\nu] = a\mu - b\nu + \iota\,[a\nu + b\mu]$. $[\mu + \iota\nu](a + \iota b) = \mu a - \nu b + \iota\,[\mu b + \nu a]$. Therefore the position of the scalar factor is indifferent. [MS. note by author.]

With these definitions, a great part of the laws of vector analysis may be applied at once to bivector expressions. But an equation which is impossible in vector analysis may be possible in bivector analysis, and in general the number of roots of an equation, or of the values of a function, will be different according as we recognize, or do not recognize, imaginary values.

3. Def. Two bivectors, or two biscalars, are said to be conjugate, when their real parts are the same, and their imaginary parts differ in sign, and in sign only.

Hence, the product of the conjugates of any number of bivectors and biscalars is the conjugate of the product of the bivectors and biscalars. This is true of any kind of product.

The products of a bivector and its conjugate are as follows:
$$[\mu + \iota\nu]\cdot[\mu - \iota\nu] = \mu\cdot\mu + \nu\cdot\nu,$$
$$[\mu + \iota\nu]\times[\mu - \iota\nu] = 2\iota\,\nu\times\mu,$$
$$[\mu + \iota\nu][\mu - \iota\nu] = \{\mu\mu + \nu\nu\} + \iota\,\{\nu\mu - \mu\nu\}.$$
Hence, if $\mu$ and $\nu$ represent the real and imaginary parts of a bivector, the values of
$$\mu\cdot\mu + \nu\cdot\nu, \qquad \mu\times\nu, \qquad \mu\mu + \nu\nu, \qquad \nu\mu - \mu\nu,$$
are not affected by multiplying the bivector by a biscalar of the form $a + \iota b$, in which $a^2 + b^2 = 1$, say a cyclic scalar.
Thus, if we set $\mu' + \iota\nu' = (a + \iota b)[\mu + \iota\nu]$, we shall have $\mu' - \iota\nu' = (a - \iota b)[\mu - \iota\nu]$, and
$$[\mu' + \iota\nu'] \,.\, [\mu' - \iota\nu'] = [\mu + \iota\nu] \,.\, [\mu - \iota\nu].$$
That is, $\mu'.\mu' + \nu'.\nu' = \mu.\mu + \nu.\nu$; and so in the other cases.

4. Def. In biscalar analysis, the product of a biscalar and its conjugate is a positive scalar. The positive square root of this scalar is called the modulus of the biscalar. In bivector analysis, the direct product of a bivector and its conjugate is, as seen above, a positive scalar. The positive square root of this scalar may be called the modulus of the bivector. When this modulus vanishes, the bivector vanishes, and only in this case. If the bivector is multiplied by a biscalar, its modulus is multiplied by the modulus of the biscalar. The conjugate of a (real) vector is the vector itself, and the modulus of the vector is the same as its magnitude.

5. Def. If between two vectors, $\alpha$ and $\beta$, there subsists a relation of the form $\alpha = n\beta$, where $n$ is a scalar, we say that the vectors are parallel. Analogy leads us to call two bivectors parallel, when there subsists between them a relation of the form $\mathfrak{a} = m\mathfrak{b}$, where $m$ (in the most general case) is a biscalar. To aid us in comprehending the geometrical signification of this relation, we may regard the biscalar as consisting of two factors, one of which is a positive scalar (the modulus of the biscalar), and the other may be put in the form $\cos q + \iota \sin q$. The effect of multiplying a bivector by a positive scalar is obvious. To understand the effect of a multiplier of the form $\cos q + \iota \sin q$ upon a bivector $\mu + \iota\nu$, let us set
$$\mu' + \iota\nu' = (\cos q + \iota \sin q)[\mu + \iota\nu].$$
We have then
$$\mu' = \cos q\,\mu - \sin q\,\nu, \qquad \nu' = \cos q\,\nu + \sin q\,\mu.$$
Now if $\mu$ and $\nu$ are of the same magnitude and at right angles, the effect of the multiplication is evidently to rotate these vectors in their plane an angular distance $q$, which is to be measured in the direction from $\nu$ to $\mu$.
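The effect of the multiplier $\cos q + \iota \sin q$ can be checked numerically. The following is only an illustrative Python sketch and no part of the original text: the helper names `dot`, `cross` and `times_cyclic`, and the sample vectors, are assumptions made for the example. It forms $\mu'$ and $\nu'$ from the two equations just given and compares $\mu.\mu + \nu.\nu$ and $\mu \times \nu$ before and after the multiplication.

```python
import math

def dot(u, v):
    """Direct (scalar) product of two real 3-vectors."""
    return sum(a * b for a, b in zip(u, v))

def cross(u, v):
    """Skew (vector) product of two real 3-vectors."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def times_cyclic(q, mu, nu):
    """Real and imaginary parts of (cos q + i sin q)[mu + i nu]."""
    c, s = math.cos(q), math.sin(q)
    mu1 = tuple(c * m - s * n for m, n in zip(mu, nu))
    nu1 = tuple(c * n + s * m for m, n in zip(mu, nu))
    return mu1, nu1

# Sample bivector mu + i nu (values chosen arbitrarily for illustration).
mu, nu = (1.0, 2.0, 0.5), (-0.5, 1.0, 3.0)
mu1, nu1 = times_cyclic(0.7, mu, nu)

print(dot(mu, mu) + dot(nu, nu), dot(mu1, mu1) + dot(nu1, nu1))
print(cross(mu, nu), cross(mu1, nu1))
```

The two printed pairs should agree up to rounding, which is the invariance under a cyclic scalar stated in the text.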
In any case we may regard $\mu$ and $\nu$ as the projections (by parallel lines) of two perpendicular vectors of the same length. The two last equations show that $\mu'$ and $\nu'$ will be the projections of the vectors obtained by the rotation of these perpendicular vectors in their plane through the angle $q$. Hence, if we construct an ellipse of which $\mu$ and $\nu$ are conjugate semi-diameters, $\mu'$ and $\nu'$ will be another pair of conjugate semi-diameters, and the sectors between $\mu$ and $\mu'$ and between $\nu$ and $\nu'$ will each be to the whole area of the ellipse as $q$ to $2\pi$, the sector between $\nu$ and $\nu'$ lying on the same side of $\nu$ as $\mu$, and that between $\mu$ and $\mu'$ lying on the same side of $\mu$ as $-\nu$. It follows that any bivector $\mu + \iota\nu$ may be put in the form
$$(\cos q + \iota \sin q)[\alpha + \iota\beta],$$
in which $\alpha$ and $\beta$ are at right angles, being the semi-axes of the ellipse of which $\mu$ and $\nu$ are conjugate semi-diameters. This ellipse we may call the directional ellipse of the bivector. In the case of a real vector, or of a vector having a real direction, it reduces to a straight line. In any other case, the angular direction from the imaginary to the real part of the bivector is to be regarded as positive in the ellipse, and the specification of the ellipse must be considered incomplete without the indication of this direction. Parallelism of bivectors, then, signifies the similarity and similar position of their directional ellipses. Similar position includes identity of the angular directions mentioned above.

6. To reduce a given bivector $\mathfrak{r}$ to the above form, we may set
$$\mathfrak{r}.\mathfrak{r} = (\cos q + \iota \sin q)^2\,[\alpha + \iota\beta] \,.\, [\alpha + \iota\beta] = (\cos 2q + \iota \sin 2q)(\alpha.\alpha - \beta.\beta) = a + \iota b,$$
where $a$ and $b$ are scalars, which we may regard as known. The value of $q$ may be determined by the equation
$$\tan 2q = \frac{b}{a},$$
the quadrant to which $2q$ belongs being determined so as to give $\sin 2q$ and $\cos 2q$ the same signs as $b$ and $a$. Then $\alpha$ and $\beta$ will be given by the equation
$$\alpha + \iota\beta = (\cos q - \iota \sin q)\,\mathfrak{r}.$$
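The reduction just described lends itself to a short numerical sketch. The Python below is illustrative only and not part of the original: the function name `reduce_bivector`, the representation of a bivector as a list of three complex components, and the tolerance `1e-12` are all assumptions of the example. It computes $a + \iota b$ as the direct product of the bivector with itself, takes $2q$ in the quadrant fixed by the signs of $b$ and $a$ (which is what `atan2` does), and then multiplies by $\cos q - \iota \sin q$ to obtain $\alpha$ and $\beta$.

```python
import cmath
import math

def dot(u, v):
    """Direct product of two (bi)vectors given by components; no conjugation."""
    return sum(x * y for x, y in zip(u, v))

def reduce_bivector(r):
    """Return (q, alpha, beta) with r = (cos q + i sin q)[alpha + i beta]
    and alpha perpendicular to beta. Raises ValueError when r.r = 0."""
    s = dot(r, r)                        # the biscalar a + i b of the text
    if abs(s) < 1e-12:                   # assumed tolerance for the indeterminate case
        raise ValueError("r.r = 0: the reduction is indeterminate")
    q = 0.5 * math.atan2(s.imag, s.real)  # quadrant of 2q chosen from signs of b and a
    w = [cmath.exp(-1j * q) * c for c in r]   # (cos q - i sin q) r
    alpha = [c.real for c in w]
    beta = [c.imag for c in w]
    return q, alpha, beta

# Example: mu = (1, 0, 0), nu = (1, 1, 0), i.e. r = mu + i nu.
q, alpha, beta = reduce_bivector([1 + 1j, 1j, 0j])
print(q, alpha, beta, dot(alpha, beta))
```

With this choice of quadrant $\alpha.\alpha - \beta.\beta$ comes out positive, so `alpha` is the longer of the two semi-axes; the printed value of `dot(alpha, beta)` should vanish up to rounding.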
The solution is indeterminate when the real and imaginary parts of the given bivector are perpendicular and equal in magnitude. In this case the directional ellipse is a circle, and the bivector may be called circular. The criterion of a circular bivector is
$$\mathfrak{r}.\mathfrak{r} = 0.$$
It is especially to be noticed that from this equation we cannot conclude that $\mathfrak{r} = 0$, as in the analysis of real vectors. This may also be shown by expressing $\mathfrak{r}$ in the form $xi + yj + zk$, in which $x$, $y$, $z$ are biscalars. The equation then becomes
$$x^2 + y^2 + z^2 = 0,$$
which evidently does not require $x$, $y$, and $z$ to vanish, as would be the case if only real values are considered.

7. Def. We call two vectors $\rho$ and

$|\Phi|\Phi_c^{-1}$) used on page 64, was later improved by the author by the introduction of his Double Multiplication, according to which the above expression is represented by $\Phi_2$, and $|\Phi|$ by $\Phi_3$. See this volume, pages 112, 160, and 181. For an extended treatment of Professor Gibbs's researches on Double Multiplication in their application to Vector Analysis see pp. 306-321, and 333 of "Vector Analysis," by E. B. Wilson, Chas. Scribner's Sons, New York, 1901.]

IV. ON MULTIPLE ALGEBRA.

Address before the Section of Mathematics and Astronomy of the American Association for the Advancement of Science, by the Vice-President.

[Proceedings of the American Association for the Advancement of Science, vol. xxxv. pp. 37-66, 1886.]

It has been said that "the human mind has never invented a labor-saving machine equal to algebra."* If this be true, it is but natural and proper that an age like our own, characterized by the multiplication of labor-saving machinery, should be distinguished by an unexampled development of this most refined and most beautiful of machines. That such has been the case, none will question. The improvement has been in every part. Even to enumerate the principal lines of advance would be a task for any one; for me an impossibility.
But if we should ask, in what direction the advance has been made which is to characterize the development of algebra in our day, we may, I think, point to that broadening of its field and methods which gives us multiple algebra. Of the importance of this change in the conception of the office of algebra, it is hardly necessary to speak: that it is really characteristic of our time will be most evident if we go back some two or threescore years, to the time when the seeds were sown which are now yielding so abundant a harvest. The failure of Möbius, Hamilton, Grassmann, Saint-Venant to make an immediate impression upon the course of mathematical thought in any way commensurate with the importance of their discoveries is the most conspicuous evidence that the times were not ripe for the methods which they sought to introduce. A satisfactory theory of the imaginary quantities of ordinary algebra, which is essentially a simple case of multiple algebra, with difficulty obtained recognition in the first third of this century. We must observe that this double algebra, as it has been called, was not sought for or invented; it forced itself, unbidden, upon the attention of mathematicians, and with its rules already formed.

* The Nation, vol. xxxiii, p. 237.

But the idea of double algebra, once received, although as it were unwillingly, must have suggested to many minds more or less distinctly the possibility of other multiple algebras, of higher orders, possessing interesting or useful properties. The application of double algebra to the geometry of the plane suggested not unnaturally to Hamilton the idea of a triple algebra which should be capable of a similar application to the geometry of three dimensions.
He was unable to find a satisfactory triple algebra, but discovered at length a quadruple algebra, quaternions, which answered his purpose, thus satisfying, as he says in one of his letters, an intellectual want which had haunted him at least fifteen years. So confident was he of the value of this algebra, that the same hour he obtained permission to lay his discovery before the Royal Irish Academy, which he did on November 13, 1843.* This system of multiple algebra is far better known than any other, except the ordinary double algebra of imaginary quantities, far too well known to require any especial notice at my hands. All that here requires our attention is the close historical connection between the imaginaries of ordinary algebra and Hamilton's system, a fact emphasized by Hamilton himself and most writers on quaternions.

* Phil. Mag. (3), vol. xxv, p. 490; North British Review, vol. xlv (1866), p. 57.

It was quite otherwise with Möbius and Grassmann. The point of departure of the Barycentrischer Calcul of Möbius, published in 1827, a work of which Clebsch has said that it can never be admired enough,† is the use of equations in which the terms consist of letters representing points with numerical coefficients, to express barycentric relations between the points. Thus, that the point $S$ is the centre of gravity of weights, $a$, $b$, $c$, $d$, placed at the points $A$, $B$, $C$, $D$, respectively, is expressed by the equation
$$(a + b + c + d)S = aA + bB + cC + dD.$$
An equation of the more general form
$$aA + bB + cC + \text{etc.} = pP + qQ + rR + \text{etc.}$$
signifies that the weights $a$, $b$, $c$, etc., at the points $A$, $B$, $C$, etc., have the same sum and the same centre of gravity as the weights $p$, $q$, $r$, etc., at the points $P$, $Q$, $R$, etc., or, in other words, that the former are barycentrically equivalent to the latter. Such equations, of which each represents four ordinary equations, may evidently be multiplied or divided by scalars,‡ may be added or subtracted, and may have
their terms arranged and transposed, exactly like the ordinary equations of algebra. It follows that the elimination of letters representing points from equations of this kind is performed by the rules of ordinary algebra. This is evidently the beginning of a quadruple algebra, and is identical, as far as it goes, with Grassmann's marvellous geometrical algebra.

† See his eulogy on Plücker, p. 14, Gött. Abhandl., vol. xvi.

‡ I use this term in Hamilton's sense, to denote the ordinary positive and negative quantities of algebra. It may, however, be observed that in most cases in which I shall have occasion to use it, the proposition would hold without exclusion of imaginary quantities, that this exclusion is generally for simplicity and not from necessity.

In the same work we find also, for the first time so far as I am aware, the distinction of positive and negative consistently carried out on the designation of segments of lines, of triangles, and of tetrahedra, viz., that a change in place of two letters, in such expressions as $AB$, $ABC$, $ABCD$, is equivalent to prefixing the negative sign. It is impossible to overestimate the importance of this step, which gives to designations of this kind the generality and precision of algebra. Moreover, if $A$, $B$, $C$ are three points in the same straight line, and $D$ any point outside of that line, the author observes that we have
$$AB + BC + CA = 0,$$
and also, with $D$ prefixed,
$$DAB + DBC + DCA = 0.$$
Again, if $A$, $B$, $C$, $D$ are four points in the same plane, and $E$ any point outside of that plane, we have
$$ABC - BCD + CDA - DAB = 0,$$
and also, with $E$ prefixed,
$$EABC - EBCD + ECDA - EDAB = 0.$$
The similarity to multiplication in the derivation of these formulae cannot have escaped the author's notice. Yet he does not seem to have been able to generalize these processes. It was reserved for the genius of Grassmann to see that $AB$ might be regarded as the product of $A$ and $B$, $DAB$ as the product of $D$ and $AB$, and $EABC$ as the product of $E$ and $ABC$.
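The sign conventions of Möbius quoted above can be tried on concrete points. The Python sketch below is an illustration added here, not anything found in the Barycentrischer Calcul: the function name `area2` and the sample coordinates are assumptions, and a triangle $PQR$ is represented simply by twice its signed area in a plane.

```python
def area2(P, Q, R):
    """Twice the signed area of triangle PQR in the plane
    (positive when P, Q, R run counter-clockwise)."""
    return (Q[0] - P[0]) * (R[1] - P[1]) - (Q[1] - P[1]) * (R[0] - P[0])

# A change in place of two letters is equivalent to prefixing the negative sign.
A, B, C, D = (0, 0), (4, 1), (3, 5), (-2, 3)
print(area2(A, B, C), area2(B, A, C))

# Four points in one plane: ABC - BCD + CDA - DAB = 0.
print(area2(A, B, C) - area2(B, C, D) + area2(C, D, A) - area2(D, A, B))

# Three points A1, B1, C1 on one straight line and D1 outside it:
# DAB + DBC + DCA = 0.
A1, B1, C1, D1 = (0, 0), (2, 1), (6, 3), (1, 5)
print(area2(D1, A1, B1) + area2(D1, B1, C1) + area2(D1, C1, A1))
```

Integer coordinates are used so that the sums are exact rather than subject to rounding.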
That Möbius could not make this step was evidently due to the fact that he had not the conception of the addition of other multiple quantities than such as may be represented by masses situated at points. Even the addition of vectors (i.e., the fact that the composition of directed lines could be treated as an addition) seems to have been unknown to him at this time, although he subsequently discovered it, and used it in his Mechanik des Himmels, which was published in 1843. This addition of vectors, or geometrical addition, seems to have occurred independently to many persons.

Seventeen years after the Barycentrischer Calcul, in 1844, the year in which Hamilton's first papers on quaternions appeared in print, Grassmann published his Lineale Ausdehnungslehre, in which he developed the idea and the properties of the external or combinatorial product, a conception which is perhaps to be regarded as the greatest monument of the author's genius. This volume was to have been followed by another, of the nature of which some intimation was given in the preface and in the work itself. We are especially told that the internal product,* which for vectors is identical except in sign with the scalar part of Hamilton's product (just as Grassmann's external product of two vectors is practically identical with the vector part of Hamilton's product), and the open product,† which in the language of to-day would be called a matrix, were to be treated in the second volume. But both the internal product of vectors and the open product are clearly defined, and their fundamental properties indicated, in this first volume. This remarkable work remained unnoticed for more than twenty years, a fact which was doubtless due in part to the very abstract and philosophical manner in which the subject was presented.
In consequence of this neglect the author changed his plan, and instead of a supplementary volume published in 1862 a single volume entitled Ausdehnungslehre, in which were treated, in an entirely different style, the same topics as in the first volume, as well as those which he had reserved for the second.

Deferring for the moment the discussion of these topics in order to follow the course of events, we find in the year following the first Ausdehnungslehre a remarkable memoir of Saint-Venant,‡ in which are clearly described the addition both of vectors and of oriented areas, the differentiation of these with respect to a scalar quantity, and a multiplication of two vectors and of a vector and an oriented area. These multiplications, called by the author geometrical, are entirely identical with Grassmann's external multiplication of the same quantities.

It is a striking fact in the history of the subject, that the short period of less than two years was marked by the appearance of well-developed and valuable systems of multiple algebra by British, German, and French authors, working apparently entirely independently of one another. No system of multiple algebra had appeared before, so far as I know, except such as were confined to additive processes with multiplication by scalars, or related to the ordinary double algebra of imaginary quantities. But the appearance of a single one of these systems would have been sufficient to mark an epoch, perhaps the most important epoch in the history of the subject.

In 1853 and 1854, Cauchy published several memoirs on what he called clefs algébriques.§ These were units subject generally to combinatorial multiplication. His principal application was to the theory of elimination. In this application, as in the law of multiplication, he had been anticipated by Grassmann.

* See the preface.
† See § 172.
‡ Comptes Rendus, vol. xxi, p. 620.
§ Comptes Rendus, vols. xxxvi, ff.
We come next to Cayley's celebrated Memoir on the Theory of Matrices* in 1858, of which Sylvester has said that it seems to him to have ushered in the reign of Algebra the Second.† I quote this dictum of a master as showing his opinion of the importance of the subject and of the memoir. But the foundations of the theory of matrices, regarded as multiple quantities, seem to me to have been already laid in the Ausdehnungslehre of 1844. To Grassmann's treatment of this subject we shall recur later.

After the Ausdehnungslehre of 1862, already mentioned, we come to Hankel's Vorlesungen über die complexen Zahlen, 1867. Under this title the author treats of the imaginary quantities of ordinary algebra, of what he calls alternirende Zahlen, and of quaternions. These alternate numbers, like Cauchy's clefs, are quantities subject to Grassmann's law of combinatorial multiplication. This treatise, published twenty-three years after the first Ausdehnungslehre, marks the first impression which we can discover of Grassmann's ideas upon the course of mathematical thought. The transcendent importance of these ideas was fully appreciated by the author, whose very able work seems to have had considerable influence in calling the attention of mathematicians to the subject.

In 1870, Professor Benjamin Peirce published his Linear Associative Algebra, subsequently developed and enriched by his son, Professor C. S. Peirce. The fact that the edition was lithographed seems to indicate that even at this late date a work of this kind could only be regarded as addressed to a limited number of readers. But the increasing interest in such subjects is shown by the republication of this memoir in 1881,‡ as by that of the first Ausdehnungslehre in 1878.

The article on quaternions which has just appeared in the Encyclopaedia Britannica mentions twelve treatises, including second editions and translations, besides the original treatises of Hamilton.
That all the twelve are later than 1861 and all but two later than 1872 shows the rapid increase of interest in this subject in the last years.

Finally, we arrive at the Lectures on the Principles of Universal Algebra by the distinguished foreigner whose sojourn among us has given such an impulse to mathematical study in this country. The publication of these lectures, commenced in 1884 in the American Journal of Mathematics, has not as yet been completed, a want but imperfectly supplied by the author's somewhat desultory publication of many remarkable papers on the same subject (which might be more definitely expressed as the algebra of matrices) in various foreign journals.

* Phil. Trans., vol. cxlviii.
† Amer. Journ. Math., vol. vi, p. 271.
‡ Amer. Journ. Math., vol. iv.

It is not an accident that this century has seen the rise of multiple algebra. The course of the development of ideas in algebra and in geometry, although in the main independent of any aid from this source, has nevertheless to a very large extent been of a character which can only find its natural expression in multiple algebra.

Our Modern Higher Algebra is especially occupied with the theory of linear transformations. Now what are the first notions which we meet in this theory? We have a set of $n$ variables, say $x$, $y$, $z$, and another set, say $x'$, $y'$, $z'$, which are homogeneous linear functions of the first, and therefore expressible in terms of them by means of a block of $n^2$ coefficients. Here the quantities occur by sets, and invite the notations of multiple algebra. It was in fact shown by Grassmann in his first Ausdehnungslehre and by Cauchy nine years later, that the notations of multiple algebra afford a natural key to the subject of elimination.

Now I do not merely mean that we may save a little time or space by writing perhaps $\rho$ for $x$, $y$ and $z$; $\rho'$ for $x'$, $y'$ and $z'$; and $\Phi$ for a block of $n^2$ quantities.
But I mean that the subject as usually treated under the title of determinants has a stunted and misdirected development on account of the limitations of single algebra. This will appear from a very simple illustration. After a little preliminary matter the student comes generally to a chapter entitled "Multiplication of Determinants," in which he is taught that the product of the determinants of two matrices may be found by performing a somewhat lengthy operation on the two matrices, by which he obtains a third matrix, and then taking the determinant of this. But what significance, what value has this theorem? For aught that appears in the majority of treatises which I have seen, we have only a complicated and lengthy way of performing a simple operation. The real facts of the case may be stated as follows:

Suppose the set of $n$ quantities $\rho'$ to be derived from the set $\rho$ by the matrix $\Phi$, which we may express by
$$\rho' = \Phi.\rho;$$
and suppose the set $\rho''$ to be derived from the set $\rho'$ by the matrix $\Psi$, i.e.,
$$\rho'' = \Psi.\rho', \qquad \text{and} \qquad \rho'' = \Psi.\Phi.\rho;$$
it is evident that $\rho''$ can be derived from $\rho$ by the operation of a single matrix, say $\Theta$, i.e.,
$$\rho'' = \Theta.\rho,$$
so that
$$\Theta = \Psi.\Phi.$$
In the language of multiple algebra $\Theta$ is called the product of $\Psi$ and $\Phi$. It is of course interesting to see how it is derived from the latter, and it is little more than a schoolboy's exercise to determine this. Now this matrix $\Theta$ has the property that its determinant is equal to the product of the determinants of $\Psi$ and $\Phi$. And this property is all that is generally stated in the books, and the fundamental property, which is all that gives the subject its interest, that $\Theta$ is itself the product of $\Psi$ and $\Phi$ in the language of multiple algebra, i.e., that operating by $\Theta$ is equivalent to operating successively by $\Phi$ and $\Psi$, is generally omitted. The chapter on this subject, in most treatises which I have seen, reads very like the play of Hamlet with Hamlet's part left out.

And what is the cause of this omission? Certainly not ignorance of the property in question. The fact that it is occasionally given would be a sufficient bar to this answer.
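The two properties contrasted above, that operating by the product matrix is equivalent to operating successively by its factors, and that its determinant is the product of their determinants, can both be seen in a few lines. The following Python sketch is purely illustrative and is not drawn from the address: the names `matmul`, `apply` and `det`, the use of nested lists for matrices, and the sample entries are assumptions of the example.

```python
def matmul(P, Q):
    """Product matrix P.Q of two square matrices given as nested lists."""
    n = len(P)
    return [[sum(P[i][k] * Q[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def apply(M, v):
    """Set of quantities M.v derived from the set v by the matrix M."""
    return [sum(M[i][k] * v[k] for k in range(len(v))) for i in range(len(M))]

def det(M):
    """Determinant by expansion along the first row (fine for small n)."""
    if len(M) == 1:
        return M[0][0]
    total = 0
    for j, a in enumerate(M[0]):
        minor = [row[:j] + row[j + 1:] for row in M[1:]]
        total += (-1) ** j * a * det(minor)
    return total

phi = [[1, 2, 0], [0, 1, 3], [4, 0, 1]]   # sample matrix, plays the part of Phi
psi = [[2, 0, 1], [1, 1, 0], [0, 3, 1]]   # sample matrix, plays the part of Psi
rho = [1, -2, 3]                          # sample set of n = 3 quantities

theta = matmul(psi, phi)                  # Theta = Psi.Phi

# The fundamental property: operating by Theta equals operating by Phi, then Psi.
print(apply(theta, rho), apply(psi, apply(phi, rho)))

# The property usually stated in the books: det Theta = det Psi * det Phi.
print(det(theta), det(psi) * det(phi))
```

Note that the determinant check uses only three numbers, whereas the first check compares the whole sets of quantities; that difference is the point of the passage.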
It is because the author fails to see that his real subject is matrices and not determinants. Of course, in a certain sense, the author has a right to choose his subject. But this does not mean that the choice is unimportant, or that it should be determined by chance or by caprice. The problem well put is half solved, as we all know. If one chooses the subject ill, it will develop itself in a cramped manner.

But the case is really much worse than I have stated it. Not only is the true significance of the formation of $\Theta$ from $\Psi$ and $\Phi$ not given, but the student is often not taught to form the matrix which is the product of $\Psi$ and

$\xi$, $\eta$, $\zeta$ are the reciprocals of the segments, and $x$, $y$, $z$ are the coordinates of any point in the plane. Now if we set
$$p = \xi x + \eta y + \zeta z, \tag{2}$$
this letter will represent an expression which represents the plane. In fact, we may say that $p$ implicitly contains $\xi$, $\eta$, and $\zeta$, which are the coordinates of the plane. We may therefore speak of the plane $p$, and for many purposes can introduce the letter $p$ into our equations instead of $\xi$, $\eta$, $\zeta$. For example, the equation
$$p''' = \frac{p' + p''}{2} \tag{3}$$
is equivalent to the three equations
$$\xi''' = \frac{\xi' + \xi''}{2}, \qquad \eta''' = \frac{\eta' + \eta''}{2}, \qquad \zeta''' = \frac{\zeta' + \zeta''}{2}. \tag{4}$$
It is to be noticed that on account of the indeterminateness of the $x$, $y$, and $z$, this method, regarded as an analytical artifice, is identical with that of Lagrange, also that in multiple algebra we should have an equation of precisely the same form as (3) to express the same relation between the planes, but that the equation would be explained to the student in a totally different manner. This we shall see more particularly hereafter.

It is curious that we have thus a simpler notation for a plane than for a point. This however may be reversed. If we commence with the notion of the coordinates of a plane, $\xi$, $\eta$, $\zeta$, the equation of a point (i.e., the equation between $\xi$, $\eta$, $\zeta$ which will hold for every plane passing through the point) will be
$$x\xi + y\eta + z\zeta = 1, \tag{5}$$
where $x$, $y$, $z$ are the coordinates of the point.
Now if we set
$$q = x\xi + y\eta + z\zeta, \tag{6}$$
we may regard the single letter $q$ as representing the point, and use it, in many cases, instead of the coordinates $x$, $y$, $z$, which indeed it implicitly contains. Thus we may write
$$q''' = \frac{q' + q''}{2} \tag{7}$$
for the three equations
$$x''' = \frac{x' + x''}{2}, \qquad y''' = \frac{y' + y''}{2}, \qquad z''' = \frac{z' + z''}{2}. \tag{8}$$
Here, by an analytical artifice, we come to equations identical in form and meaning with those used by Hamilton, Grassmann, and even by Möbius in 1827. But the explanations of the formulae would differ widely. The methods of the founders of multiple algebra are characterized by a bold simplicity, that of the modern geometry by a somewhat bewildering ingenuity. That $p$ and $q$ represent the same expression (in one case $x$, $y$, $z$, and in the other $\xi$, $\eta$, $\zeta$ being indeterminate) is a circumstance which may easily become perplexing. I am not quite certain that it would be convenient to use both of these abridged notations at the same time. In fact, if the geometer using these methods were asked to express by an equation in $p$ and $q$ that the point $q$ lies in the plane $p$, he might find himself somewhat entangled in the meshes of his own ingenuity, and need some new artifice to extricate himself. I do not mean that his genius might not possibly be equal to the occasion, but I do mean very seriously that it is a vicious method which requires any ingenuity or any artifice to express so simple a relation.

If we use the methods of multiple algebra which are most comparable to those just described, a point is naturally represented by a vector ($\rho$) drawn to it from the origin, a plane by a vector ($\sigma$) drawn from the origin perpendicularly toward the plane and in length equal to the reciprocal of the distance of the plane from the origin. The equation
$$\sigma''' = \frac{\sigma' + \sigma''}{2} \tag{9}$$
will have precisely the same meaning as equation (3), and
$$\rho''' = \frac{\rho' + \rho''}{2} \tag{10}$$
will have precisely the same meaning as equation (7), viz., that the point $\rho'''$ is in the middle between $\rho'$ and $\rho''$.
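The representation of a plane by the vector $\sigma$ can be made concrete with exact arithmetic. The Python sketch below is an illustration under stated assumptions, not a rendering of the author's own formulae: the names `dot`, `plane_vector` and `midpoint` and the sample plane are invented for the example, and the incidence test $\rho.\sigma = 1$ is used as the standard consequence of the construction just described (a vector along the perpendicular, of length equal to the reciprocal of the distance).

```python
from fractions import Fraction as F

def dot(u, v):
    """Direct product of two vectors given by components."""
    return sum(a * b for a, b in zip(u, v))

def plane_vector(unit_normal, distance):
    """sigma: along the perpendicular, of length 1/distance."""
    return tuple(c / distance for c in unit_normal)

def midpoint(u, v):
    """(u + v) / 2, the form of equations (9) and (10)."""
    return tuple((a + b) / 2 for a, b in zip(u, v))

# Sample plane: unit normal (2/3, 1/3, 2/3), at distance 3 from the origin.
n = (F(2, 3), F(1, 3), F(2, 3))
sigma = plane_vector(n, F(3))

# Its components are the reciprocals of the segments cut off on the axes,
# i.e. the plane coordinates xi, eta, zeta of the earlier notation.
intercepts = tuple(F(3) / c for c in n)
print(sigma, tuple(1 / s for s in intercepts))

# Two points of the plane and the point in the middle between them.
rho1, rho2 = (F(3), F(-1), F(2)), (F(2), F(3), F(1))
rho3 = midpoint(rho1, rho2)
print(dot(rho1, sigma), dot(rho2, sigma), dot(rho3, sigma))
```

Because the components of `sigma` coincide with $\xi$, $\eta$, $\zeta$, averaging two such vectors is the same operation as equation (3), which is the sense in which (9) and (3) agree.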
That the point $\rho$ lies in the plane

are the distances of the plane from the four points, and $x$, $y$, $z$, $w$ are the coordinates of any point in the plane. Here we may set
$$p = \xi x + \eta y + \zeta z + \omega w, \tag{15}$$
and say that $p$ represents the plane. To some extent we can introduce this letter into equations instead of $\xi$, $\eta$, $\zeta$, $\omega$. Thus the equation
$$lp' + mp'' + np''' = 0 \tag{16}$$
(which denotes that the planes $p'$, $p''$, $p'''$ meet in a common line, making angles of which the sines are proportional to $l$, $m$, and $n$) is equivalent to the four equations
$$l\xi' + m\xi'' + n\xi''' = 0, \qquad l\eta' + m\eta'' + n\eta''' = 0, \qquad \text{etc.} \tag{17}$$
Again, we may regard $\xi$, $\eta$, $\zeta$, $\omega$ as the coordinates of a plane. The equation of a point will then be
$$x\xi + y\eta + z\zeta + w\omega = 0. \tag{18}$$
If we set
$$q = x\xi + y\eta + z\zeta + w\omega, \tag{19}$$
we may say that $q$ represents the point. The equation
$$q''' = \frac{q' + q''}{2}, \tag{20}$$
which indicates that the point $q'''$ bisects the line between $q'$ and $q''$, is equivalent to the four equations
$$x''' = \frac{x' + x''}{2}, \qquad \text{etc.} \tag{21}$$
To express that the point $q$ lies in the plane $p$ does not seem easy, without going back to the use of coordinates.

The form of multiple algebra which is to be compared to this is the geometrical algebra of Möbius and Grassmann, in which points without reference to any origin are represented by single letters, say by Italic capitals, and planes may also be represented by single letters, say by Greek capitals. An equation like
$$Q''' = \frac{Q' + Q''}{2} \tag{22}$$
has exactly the same meaning as equation (20) of ordinary algebra. So
$$l\Pi' + m\Pi'' + n\Pi''' = 0 \tag{23}$$
has precisely the same meaning as equation (16) of ordinary algebra. That the point $Q$ lies in the plane $\Pi$ is expressed by equating to zero the product of $Q$ and $\Pi$ which is called by Grassmann external and which might be defined as the distance of the point from the plane. We may write this
$$Q \times \Pi = 0. \tag{24}$$
To show that so simple an expression is really amenable to analytical treatment, I observe that $Q$ may be expressed in terms of any four points (not in the same plane) on the barycentric principle explained above, viz.,
$$Q = xA + yB + zC + wD, \tag{25}$$
and $\Pi$ may be expressed in terms of combinatorial products of $A$, $B$, $C$, and $D$, viz.,
$$\Pi = \xi\, B \times C \times D + \eta\, C \times A \times D + \zeta\, D \times A \times B + \omega\, A \times C \times B, \tag{26}$$
and by these substitutions, by the laws of the combinatorial product to be mentioned hereafter, equation (24) is transformed into
$$w\omega + x\xi + y\eta + z\zeta = 0, \tag{27}$$
which is identical with the formula of ordinary analysis.*

I have gone at length into this very simple point in order to illustrate the fact, which I think is a general one, that the modern geometry is not only tending to results which are appropriately expressed in multiple algebra, but that it is actually striving to clothe itself in forms which are remarkably similar to the notations of multiple algebra, only less simple and general and far less amenable to analytical treatment, and therefore, that a certain logical necessity calls for throwing off the yoke under which analytical geometry has so long labored. And lest this should seem to be the utterance of an uninformed enthusiasm, or the echoing of the possibly exaggerated claims of the devotees of a particular branch of mathematical study, I will quote a sentence from Clebsch and one from Clifford, relating to the past and to the future of multiple algebra.
The former in his eulogy on Plücker,† in 1871, speaking of recent advances in geometry, says that "in a certain sense the coordinates of a straight line, and in general a great part of the fundamental conceptions of the newer algebra, are contained in the Ausdehnungslehre of 1844," and Clifford,‡ in the last year of his life, speaking of the Ausdehnungslehre, with which he had but recently become acquainted, expresses "his profound admiration of that extraordinary work, and his conviction that its principles will exercise a vast influence upon the future of mathematical science."

* The letters $\xi$, $\eta$, $\zeta$, $\omega$, here denote the distances of the plane $\Pi$ from the points $A$, $B$, $C$, $D$, divided by six times the volume of the tetrahedron, $A$, $B$, $C$, $D$. The letters $x$, $y$, $z$, $w$, denote the tetrahedral coordinates as above.
† Gött. Abhandl., vol. xvi, p. 28.
‡ Amer. Journ. Math., vol. i, p. 350.

Another subject in which we find a tendency toward the forms and methods of multiple algebra, is the calculus of operations. Our ordinary analysis introduces operators, and the successive operations $A$ and $B$ may be equivalent to the operation $C$. To express this in an equation we may write
$$BA(x) = C(x),$$
where $x$ is any quantity or function. We may also have occasion to write
$$A(x) + B(x) = D(x), \qquad \text{or} \qquad (A + B)(x) = D(x).$$
But it is almost impossible to resist the tendency to express these relations in the form
$$BA = C, \qquad A + B = D,$$
in which the operators appear in a sense as quantities, i.e., as subjects of functional operation. Now since these operators are often of such nature that they cannot be perfectly specified by a single numerical quantity, when we treat them as quantities they must be regarded as multiple quantities. In this way certain formulas which essentially belong to multiple algebra get a precarious footing where they are only allowed because they are regarded as abridged notations for equations in ordinary algebra.
Yet the logical development of such notations would lead a good way in multiple algebra, and doubtless many investigators have entered the field from this side.

One might also notice, to show how the ordinary algebra is becoming saturated with the notions and notations which seem destined to turn it into a multiple algebra, the notation so common in the higher algebra
$$(a, b, c)(x, y, z)$$
for $ax + by + cz$. This is evidently the same as Grassmann's internal product of the multiple quantities $(a, b, c)$ and $(x, y, z)$, or, in the language of quaternions, the scalar part, taken negatively, of the product of the vectors of which $a$, $b$, $c$ and $x$, $y$, $z$ are the components. A similar correspondence with Grassmann's methods might, I think, be shown in such notations as, for example, $(a, b, c, d)(x, y)^3$. The free admission of such notations is doubtless due to the fact that they are regarded simply as abridged notations.

The author of the celebrated "Memoir on the Theory of Matrices" goes much farther than this in his use of the forms of multiple algebra. Thus he writes explicitly one equation to stand for several, without the use of any of the analytical artifices which have been mentioned. This work has indeed, as we have seen, been characterized as marking the commencement of multiple algebra, a view to which we can only take exception as not doing justice to earlier writers.

But the significance of this memoir with regard to the point which I am now considering is that it shows that the chasm so marked in the second quarter of this century is destined to be closed up. Notions and notations for which a Cayley is sponsor will not be excluded from good society among mathematicians. And if we admit as suitable the notations used in this memoir (where it is noticeable that the author rather avoids multiple algebra, and only uses it very sparingly), we shall logically be brought to use a great deal more.
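The correspondence noted above, between $ax + by + cz$, the internal product, and the scalar part (taken negatively) of the quaternion product of two vectors, can be verified directly. The Python sketch below is illustrative only: the function names `qmul` and `internal`, the representation of a quaternion as a 4-tuple (scalar part first), and the sample components are assumptions of the example; `qmul` implements the usual Hamilton multiplication rules.

```python
def qmul(p, q):
    """Hamilton product of quaternions given as (w, x, y, z)."""
    w1, x1, y1, z1 = p
    w2, x2, y2, z2 = q
    return (w1 * w2 - x1 * x2 - y1 * y2 - z1 * z2,
            w1 * x2 + x1 * w2 + y1 * z2 - z1 * y2,
            w1 * y2 - x1 * z2 + y1 * w2 + z1 * x2,
            w1 * z2 + x1 * y2 - y1 * x2 + z1 * w2)

def internal(u, v):
    """The abridged notation (a, b, c)(x, y, z) = ax + by + cz."""
    return sum(a * b for a, b in zip(u, v))

u, v = (1, 2, 3), (4, -5, 6)           # sample components
product = qmul((0,) + u, (0,) + v)     # vectors taken as quaternions with zero scalar part

# Scalar part, taken negatively, against the internal product.
print(-product[0], internal(u, v))

# The remaining three components form the vector part of the product.
print(product[1:])
```

The vector part printed last is the quantity that the text elsewhere identifies, in practice, with Grassmann's external product of the two vectors.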
For example, if it is a good thing to write in our equations a single letter to represent a matrix of n² numerical quantities, why not use a single letter to represent the n quantities operated upon, as Grassmann and Hamilton have done? Logical consistency seems to demand it. And if we may use the sign )( to denote an operation by which two sets of quantities are combined to form a third set, as is the case in this memoir, why not use other signs to denote other functional operations of which the result is a multiple quantity? If it be conceded that this is the proper method to follow where simplicity of conception, or brevity of expression, or ease of transformation is served thereby, our algebra will become in large part a multiple algebra.

We have considered the subject a good while from the outside; we have glanced at the principal events in the history of multiple algebra; we have seen how the course of modern thought seems to demand its aid, how it is actually leaning toward it, and beginning to adopt its methods. It may be worth while to direct our attention more critically to multiple algebra itself, and inquire into its essential character and its most important principles.

I do not know that anything useful or interesting, which relates to multiple quantity, and can be symbolically expressed, falls outside of the domain of multiple algebra. But if it is asked, what notions are to be regarded as fundamental, we must answer, here as elsewhere, those which are most simple and fruitful. Unquestionably, no relations are more so than those which are known by the names of addition and multiplication.

Perhaps I should here notice the essentially different manner in which the multiplication of multiple quantities has been viewed by different writers. Some, as Hamilton, or De Morgan, or Peirce, speak of the product of two multiple quantities, as if only one product could exist, at least in the same algebra.
Others, as Grassmann, speak of various kinds of products for the same multiple quantities. Thus Hamilton seems for many years to have agitated the question, what he should regard as the product of each pair of a set of triplets, or in the geometrical application of the subject, what he should regard as the product of each pair of a system of perpendicular directed lines.* Grassmann asks, What products, i.e., what distributive functions of the multiple quantities, are most important?

* Phil. Mag. (3), vol. xxv, p. 490; North British Review, vol. xlv (1866), p. 57.

It may be that in some cases the fact that only one kind of product is known in ordinary algebra has led those to whom the problem presented itself in the form of finding a new algebra to adopt this characteristic derived from the old. Perhaps the reason lies deeper in a distinction like that in arithmetic between concrete and abstract numbers or quantities. The multiple quantities corresponding to concrete quantities such as ten apples or three miles are evidently such combinations as ten apples + seven oranges, three miles northward + five miles eastward, or six miles in a direction fifty degrees east of north. Such are the fundamental multiple quantities from Grassmann’s point of view. But if we ask what it is in multiple algebra which corresponds to an abstract number like twelve, which is essentially an operator, which changes one mile into twelve miles, and $1,000 into $12,000, the most general answer would evidently be: an operator which will work such changes as, for example, that of ten apples + seven oranges into fifty apples + 100 oranges, or that of one vector into another.

Now an operator has, of course, one characteristic relation, viz., its relation to the operand. This needs no especial definition, since it is contained in the definition of the operator.
If the operation is distributive, it may not inappropriately be called multiplication, and the result is par excellence the product of the operator and operand. The sum of operators, quâ operators, is an operator which gives for the product the sum of the products given by the operators to be added. The product of two operators is an operator which is equivalent to the successive operations of the factors. This multiplication is necessarily associative, and its definition is not really different from that of the operators themselves. And here I may observe that Professor C. S. Peirce has shown that his father’s associative algebras may be regarded as operational and matricular.*

* Amer. Journ. Math., vol. iv, p. 221.

Now the calculus of distributive operators is a subject of great extent and importance, but Grassmann’s view is the more comprehensive, since it embraces the other with something besides. For every quantitative operator may be regarded as a quantity, i.e., as the subject of mathematical operation, but every quantity cannot be regarded as an operator; precisely as in grammar every verb may be taken as substantive, as in the infinitive, while every substantive does not give us a verb.

Grassmann’s view seems also the most practical and convenient. For we often use many functions of the same pair of multiple quantities, which are distributive with respect to both, and we need some simple designation to indicate a property of such fundamental importance in the algebra of such functions, and no advantage appears in singling out a particular function to be alone called the product. Even in quaternions, where Hamilton speaks of only one product of two vectors (regarding it as a special case of the product of quaternions, i.e., of operators), he nevertheless comes to use the scalar part of this product and the vector part separately. Now the distributive law is satisfied by each of these, which therefore may conveniently be called products.
In this sense we have three kinds of products of vectors in Hamilton's analysis. Let us then adopt the more general view of multiplication, and call any function of two or more multiple quantities, which is distributive with respect to all, a product, with only this limitation, that when one of the factors is simply an ordinary algebraic quantity, its effect is to be taken in the ordinary sense. It is to be observed that this definition of multiplication implies that we have an addition both of the kind of quantity to which the product belongs, and of the kinds to which the factors belong. Of course, these must be subject to the general formal laws of addition. I do not know that it is necessary for the purposes of a general discussion to stop to define these operations more particularly, either on their own account or to complete the definition of multiplication. Algebra, as a formal science, may rest on a purely formal foundation. To take our illustration again from mechanics, we may say that if a man is inventing a particular machine,—a sewing machine, a reaper,—nothing is more important than that he should have a precise idea of the operation which his machine is to perform, yet when he is treating the general principles of mechanics he may discuss the lever, or the form of the teeth of wheels which will transmit uniform motion, without inquiring the purpose to which the apparatus is to be applied; and in like manner that if we were forming a particular algebra,—a geometrical algebra, a mechanical algebra, an algebra for the theory of elimination and substitution, an algebra for the study of quantics,—we should commence by asking, What are the multiple quantities, or sets of quantities, which we have to consider ? What are the additive relations between them ? What are the multiplicative relations between them ? 
etc., forming a perfectly defined and complete idea of these relations as we go along; but in the development of a general algebra no such definiteness of conception is requisite. Given only the purely formal law of the distributive character of multiplication, this is sufficient for the foundation of a science. Nor will such a science be merely a pastime for an ingenious mind. It will serve a thousand purposes in the formation of particular algebras. Perhaps we shall find that in the most important cases the particular algebra is little more than an application or interpretation of the general.

Grassmann observes that any kind of multiplication of n-fold quantities is characterized by the relations which hold between the products of n independent units. In certain kinds of multiplication these characteristic relations will hold true of the products of any of the quantities. Thus if the value of a product is independent of the order of the factors when these belong to the system of units, it will always be independent of the order of the factors. The kind of multiplication characterized by this relation and no other between the products is called by Grassmann algebraic, because its rules coincide with those of ordinary algebra. It is to be observed, however, that it gives rise to multiple quantities of higher orders. If n independent units are required to express the original quantities, $\frac{n(n+1)}{2}$ units will be required for the products of two factors, $\frac{n(n+1)(n+2)}{2\cdot 3}$ for the products of three factors, etc.

Again, if the value of a product of factors belonging to a system of units is multiplied by −1 when two factors change places, the same will be true of the product of any factors obtained by addition of the units. The kind of multiplication characterized by this relation and no other is called by Grassmann external or combinatorial. For our present purpose we may denote it by the sign ×.
It gives rise to multiple quantities of higher orders, $\frac{n(n-1)}{2}$ units being required to express the products of two factors, $\frac{n(n-1)(n-2)}{2\cdot 3}$ units for products of three factors, etc. All products of more than n factors are zero. The products of n factors may be expressed by a single unit, viz., the product of the n original units taken in a specified order, which is generally set equal to 1. The products of n − 1 factors are expressed in terms of n units, those of n − 2 factors in terms of $\frac{n(n-1)}{2}$ units, etc. This kind of multiplication is associative, like the algebraic.

Grassmann observes, with respect to binary products, that these two kinds of multiplication are the only kinds characterized by laws which are the same for any factors as for particular units, except indeed that characterized by no special laws, and that for which all products are zero.* The last we may evidently reject as nugatory. That for which there are no special laws, i.e., in which no equations subsist between the products of a system of independent units, is also rejected by Grassmann, as not appearing to afford important applications. I shall, however, have occasion to speak of it, and shall call it the indeterminate product. In this kind of multiplication, n² units are required to express the products of two factors, and n³ units for products of three factors, etc. It evidently may be regarded as associative.

* Crelle’s Journ. f. Math., vol. xlix, p. 138.

Another very important kind of multiplication is that called by Grassmann internal. In the form in which I shall give it, which is less general than Grassmann’s, it is in one respect the most simple of all, since its only result is a numerical quantity. It is essentially binary and characterized by laws of the form

i.i = 1, j.j = 1, k.k = 1, etc.,
i.j = 0, j.i = 0, etc.,

where i, j, k, etc., represent a system of independent units. I use the dot as significant of this kind of multiplication.
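The unit counts quoted for the algebraic, combinatorial, and indeterminate products are binomial and power counts, and may be checked mechanically. The following is only an illustrative sketch; the function name `units_required` is hypothetical and forms no part of Grassmann's or the author's notation.

```python
from math import comb


def units_required(n: int, k: int) -> dict:
    """Number of independent units needed to express products of k factors
    formed from n independent units, for each kind of multiplication."""
    return {
        # order of factors indifferent: n(n+1)/2 for k = 2
        "algebraic": comb(n + k - 1, k),
        # sign changes on interchange: n(n-1)/2 for k = 2, zero when k > n
        "combinatorial": comb(n, k),
        # no relations between the products of the units
        "indeterminate": n ** k,
    }


if __name__ == "__main__":
    for k in (2, 3):
        print(k, units_required(3, k))
```

With n = 4, for instance, the combinatorial products of three factors need four units (the case of n − 1 factors), and those of five factors need none, in agreement with the statement that all products of more than n factors are zero.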
Grassmann derives this kind of multiplication from the combinatorial by the following process. He defines the complement (Ergänzung) of a unit as the combinatorial product of all the other units, taken with such a sign that the combinatorial product of the unit and its complement shall be positive. The combinatorial product of a unit and its complement is therefore unity, and that of a unit and the complement of any other unit is zero. The internal product of two units is the combinatorial product of the first and the complement of the second.

It is important to observe that any scalar product of two factors of the same kind of multiple quantities, which is positive when the factors are identical, may be regarded as an internal product, i.e., we may always find such a system of units, that the characteristic equations of the product will reduce to the above form. The nature of the subject may afford a definition of the product independent of any reference to a system of units. Such a definition will then have obvious advantages. An important case of this kind occurs in geometry in that product of two vectors which is obtained by multiplying the product of their lengths by the cosine of the angle which they include. This is an internal product in Grassmann’s sense.

Let us now return to the indeterminate product, which I am inclined to regard as the most important of all, since we may derive from it the algebraic and the combinatorial. For this end we will prefix Σ to an indeterminate product to denote the sum of all the terms obtained by taking the factors in every possible order. Then, Σ α|β|γ, for instance, where the vertical line is used to denote the indeterminate product, is a distributive function of α, β, and γ. It is evidently not affected by changing the order of the letters. It is, therefore, an algebraic product in the sense in which the term has been defined.
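The construction of a symmetric product by summing the indeterminate product over every order of the factors, and the companion construction in which odd permutations are taken negatively, can be tried numerically. The sketch below rests on assumptions of its own: a multiple quantity is represented by its list of coefficients on the units, an indeterminate product by a dictionary from tuples of unit indices to coefficients, and the names `indeterminate`, `parity`, and `summed` are hypothetical.

```python
from itertools import permutations, product


def indeterminate(*factors):
    """Indeterminate product of quantities given as coefficient lists:
    each tuple of unit indices maps to the product of the coefficients."""
    out = {}
    for idx in product(*(range(len(f)) for f in factors)):
        c = 1
        for f, i in zip(factors, idx):
            c *= f[i]
        out[idx] = c
    return out


def parity(perm):
    """+1 for an even permutation of 0..k-1, -1 for an odd one."""
    perm = list(perm)
    sign = 1
    for i in range(len(perm)):
        while perm[i] != i:
            j = perm[i]
            perm[i], perm[j] = perm[j], perm[i]  # one simple permutation
            sign = -sign
    return sign


def summed(factors, signed=False):
    """Sum of the indeterminate products of the factors in every order;
    with signed=True, terms from odd permutations are taken negatively."""
    total = {}
    for perm in permutations(range(len(factors))):
        s = parity(perm) if signed else 1
        terms = indeterminate(*(factors[p] for p in perm))
        for idx, c in terms.items():
            total[idx] = total.get(idx, 0) + s * c
    return total


if __name__ == "__main__":
    a, b = [1, 2, 0], [0, 1, 3]
    print(summed([a, b]) == summed([b, a]))       # unsigned sum: order indifferent
    print(summed([a, b], signed=True)[(0, 1)])    # a0*b1 - a1*b0
```

Interchanging two factors leaves the unsigned sum unaltered and reverses the sign of every coefficient of the signed sum, which is the property by which the algebraic and combinatorial products are respectively characterized.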
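So, again, if we prefix Σ± to an indeterminate product to denote the sum of all terms obtained by giving the factors every possible order, those terms being taken negatively which are obtained by an odd number of simple permutations, Σ± α|β|γ, for instance, will be a distributive function of α, β, γ, which is multiplied by −1 when two of these letters change places. It will therefore be a combinatorial product.

It is a characteristic and very important property of an indeterminate product that every product of all its factors with any other quantities is also a product of the indeterminate product and the other quantities. We need not stop for a formal proof of this proposition, which indeed is an immediate consequence of the definitions of the terms.

These considerations bring us naturally to what Grassmann calls regressive multiplication, which I will first illustrate by a very simple example. If n, the degree of multiplicity of our original quantities, is 4, the combinatorial product of α×β×γ and δ×ε, viz., α×β×γ×δ×ε, is necessarily zero, since the number of factors exceeds four. But if for δ×ε we set its equivalent …

DETERMINATION OF ELLIPTIC ORBITS.

$$A_1\left(1+\frac{B_1}{r_1^{3}}\right)\mathfrak{R}_1 - \left(1-\frac{B_2}{r_2^{3}}\right)\mathfrak{R}_2 + A_3\left(1+\frac{B_3}{r_3^{3}}\right)\mathfrak{R}_3 = 0, \qquad (1)$$

where

$$A_1 = \frac{\tau_1}{\tau_1+\tau_3}, \qquad A_3 = \frac{\tau_3}{\tau_1+\tau_3},$$

$$B_1 = \tfrac{1}{12}\left(-\tau_1^{2}+\tau_1\tau_3+\tau_3^{2}\right), \qquad B_2 = \tfrac{1}{12}\left(\tau_1^{2}+3\tau_1\tau_3+\tau_3^{2}\right), \qquad B_3 = \tfrac{1}{12}\left(\tau_1^{2}+\tau_1\tau_3-\tau_3^{2}\right).$$

This we shall call our fundamental equation. In order to discuss its geometrical signification, let us set

$$n_1 = A_1\left(1+\frac{B_1}{r_1^{3}}\right), \qquad n_2 = 1-\frac{B_2}{r_2^{3}}, \qquad n_3 = A_3\left(1+\frac{B_3}{r_3^{3}}\right), \qquad (2)$$

so that the equation will read

$$n_1\mathfrak{R}_1 - n_2\mathfrak{R}_2 + n_3\mathfrak{R}_3 = 0. \qquad (3)$$

This expresses that the vector $n_2\mathfrak{R}_2$ is the diagonal of a parallelogram of which $n_1\mathfrak{R}_1$ and $n_3\mathfrak{R}_3$ are sides. If we multiply by $\mathfrak{R}_3$ and by $\mathfrak{R}_1$, in skew multiplication, we get

$$n_1\,\mathfrak{R}_1\times\mathfrak{R}_3 - n_2\,\mathfrak{R}_2\times\mathfrak{R}_3 = 0, \qquad -n_2\,\mathfrak{R}_2\times\mathfrak{R}_1 + n_3\,\mathfrak{R}_3\times\mathfrak{R}_1 = 0, \qquad (4)$$

whence

$$\frac{\mathfrak{R}_2\times\mathfrak{R}_3}{n_1} = \frac{\mathfrak{R}_1\times\mathfrak{R}_3}{n_2} = \frac{\mathfrak{R}_1\times\mathfrak{R}_2}{n_3}. \qquad (5)$$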
Our equation may therefore be regarded as signifying that the three vectors $\mathfrak{R}_1$, $\mathfrak{R}_2$, $\mathfrak{R}_3$ lie in one plane, and that the three triangles determined each by a pair of these vectors, and usually denoted by $[r_2r_3]$, $[r_1r_3]$, $[r_1r_2]$, are proportional to $n_1$, $n_2$, $n_3$.

Since this vector equation is equivalent to three ordinary equations, it is evidently sufficient to determine the three positions of the body in connection with the conditions that these positions must lie upon the lines of sight of three observations. To give analytical expression to these conditions, we may write $\mathfrak{E}_1$, $\mathfrak{E}_2$, $\mathfrak{E}_3$ for the vectors drawn from the sun to the three positions of the earth (or, more exactly, of the observatories where the observations have been made), $\mathfrak{F}_1$, $\mathfrak{F}_2$, $\mathfrak{F}_3$ for unit vectors drawn in the directions of the body, as observed, and $\rho_1$, $\rho_2$, $\rho_3$ for the three distances of the body from the places of observation. We have then

$$\mathfrak{R}_1 = \mathfrak{E}_1+\rho_1\mathfrak{F}_1, \qquad \mathfrak{R}_2 = \mathfrak{E}_2+\rho_2\mathfrak{F}_2, \qquad \mathfrak{R}_3 = \mathfrak{E}_3+\rho_3\mathfrak{F}_3. \qquad (6)$$

By substitution of these values our fundamental equation becomes

$$A_1\left(1+\frac{B_1}{r_1^{3}}\right)(\mathfrak{E}_1+\rho_1\mathfrak{F}_1) - \left(1-\frac{B_2}{r_2^{3}}\right)(\mathfrak{E}_2+\rho_2\mathfrak{F}_2) + A_3\left(1+\frac{B_3}{r_3^{3}}\right)(\mathfrak{E}_3+\rho_3\mathfrak{F}_3) = 0, \qquad (7)$$

where $\rho_1$, $\rho_2$, $\rho_3$, $r_1$, $r_2$, $r_3$ (the geocentric and heliocentric distances) are the only unknown quantities. From equations (6) we also get, by squaring both members in each,

$$r_1^{2} = \mathfrak{E}_1^{2}+2(\mathfrak{E}_1.\mathfrak{F}_1)\rho_1+\rho_1^{2}, \qquad r_2^{2} = \mathfrak{E}_2^{2}+2(\mathfrak{E}_2.\mathfrak{F}_2)\rho_2+\rho_2^{2}, \qquad r_3^{2} = \mathfrak{E}_3^{2}+2(\mathfrak{E}_3.\mathfrak{F}_3)\rho_3+\rho_3^{2}, \qquad (8)$$

by which the values of $r_1$, $r_2$, $r_3$ may be derived from those of $\rho_1$, $\rho_2$, $\rho_3$, or vice versa. Equations (7) and (8), which are equivalent to six ordinary equations, are sufficient to determine the six quantities $r_1$, $r_2$, $r_3$, $\rho_1$, $\rho_2$, $\rho_3$; or, if we suppose the values of $r_1$, $r_2$, $r_3$ in terms of $\rho_1$, $\rho_2$, $\rho_3$ to be substituted in equation (7), we have a single vector equation, from which we may determine the three geocentric distances $\rho_1$, $\rho_2$, $\rho_3$.

It remains to be shown, first, how the numerical solution of the equation may be performed, and secondly, how such an approximate solution of the actual problem may furnish the basis of a closer approximation.

Solution of the Fundamental Equation.
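The relations with which we have to do will be rendered a little more simple if instead of each geocentric distance we introduce the distance of the body from the foot of the perpendicular from the sun upon the line of sight. If we set

$$q_1 = \rho_1+(\mathfrak{E}_1.\mathfrak{F}_1), \qquad q_2 = \rho_2+(\mathfrak{E}_2.\mathfrak{F}_2), \qquad q_3 = \rho_3+(\mathfrak{E}_3.\mathfrak{F}_3), \qquad (9)$$

$$p_1^{2} = \mathfrak{E}_1^{2}-(\mathfrak{E}_1.\mathfrak{F}_1)^{2}, \qquad p_2^{2} = \mathfrak{E}_2^{2}-(\mathfrak{E}_2.\mathfrak{F}_2)^{2}, \qquad p_3^{2} = \mathfrak{E}_3^{2}-(\mathfrak{E}_3.\mathfrak{F}_3)^{2}, \qquad (10)$$

equations (8) become

$$r_1^{2} = q_1^{2}+p_1^{2}, \qquad r_2^{2} = q_2^{2}+p_2^{2}, \qquad r_3^{2} = q_3^{2}+p_3^{2}. \qquad (11)$$

Let us also set, for brevity,

$$\mathfrak{S}_1 = A_1\left(1+\frac{B_1}{r_1^{3}}\right)(\mathfrak{E}_1+\rho_1\mathfrak{F}_1), \qquad \mathfrak{S}_2 = -\left(1-\frac{B_2}{r_2^{3}}\right)(\mathfrak{E}_2+\rho_2\mathfrak{F}_2), \qquad \mathfrak{S}_3 = A_3\left(1+\frac{B_3}{r_3^{3}}\right)(\mathfrak{E}_3+\rho_3\mathfrak{F}_3). \qquad (12)$$

Then $\mathfrak{S}_1$, $\mathfrak{S}_2$, $\mathfrak{S}_3$ may be regarded as functions respectively of $\rho_1$, $\rho_2$, $\rho_3$, therefore of $q_1$, $q_2$, $q_3$, and if we set

$$\mathfrak{S}' = \frac{d\mathfrak{S}_1}{dq_1}, \qquad \mathfrak{S}'' = \frac{d\mathfrak{S}_2}{dq_2}, \qquad \mathfrak{S}''' = \frac{d\mathfrak{S}_3}{dq_3}, \qquad (13)$$

and

$$\mathfrak{S} = \mathfrak{S}_1+\mathfrak{S}_2+\mathfrak{S}_3, \qquad (14)$$

we shall have

$$d\mathfrak{S} = \mathfrak{S}'\,dq_1+\mathfrak{S}''\,dq_2+\mathfrak{S}'''\,dq_3. \qquad (15)$$

To determine the value of $\mathfrak{S}'$, we get by differentiation

$$\mathfrak{S}' = A_1\left(1+\frac{B_1}{r_1^{3}}\right)\mathfrak{F}_1 - \frac{3A_1B_1}{r_1^{4}}\,\frac{dr_1}{dq_1}\,(\mathfrak{E}_1+\rho_1\mathfrak{F}_1). \qquad (16)$$

But by (11)

$$\frac{dr_1}{dq_1} = \frac{q_1}{r_1}. \qquad (17)$$

Therefore

$$\mathfrak{S}' = A_1\left(1+\frac{B_1}{r_1^{3}}\right)\mathfrak{F}_1 - \frac{3B_1q_1}{r_1^{5}\left(1+B_1r_1^{-3}\right)}\,\mathfrak{S}_1,$$
$$\mathfrak{S}'' = -\left(1-\frac{B_2}{r_2^{3}}\right)\mathfrak{F}_2 + \frac{3B_2q_2}{r_2^{5}\left(1-B_2r_2^{-3}\right)}\,\mathfrak{S}_2, \qquad (18)$$
$$\mathfrak{S}''' = A_3\left(1+\frac{B_3}{r_3^{3}}\right)\mathfrak{F}_3 - \frac{3B_3q_3}{r_3^{5}\left(1+B_3r_3^{-3}\right)}\,\mathfrak{S}_3.$$

Now if any values of $q_1$, $q_2$, $q_3$ (either assumed or obtained by a previous approximation) give a certain residual $\mathfrak{S}$ (which would be zero if the values of $q_1$, $q_2$, $q_3$ satisfied the fundamental equation), and we wish to find the corrections $\Delta q_1$, $\Delta q_2$, $\Delta q_3$, which must be added to $q_1$, $q_2$, $q_3$ to reduce the residual to zero, we may apply equation (15) to these finite differences, and will have approximately, when these differences are not very large,

$$-\mathfrak{S} = \mathfrak{S}'\Delta q_1+\mathfrak{S}''\Delta q_2+\mathfrak{S}'''\Delta q_3. \qquad (19)$$

This gives*

$$\Delta q_1 = -\frac{(\mathfrak{S}\mathfrak{S}''\mathfrak{S}''')}{(\mathfrak{S}'\mathfrak{S}''\mathfrak{S}''')}, \qquad \Delta q_2 = -\frac{(\mathfrak{S}\mathfrak{S}'''\mathfrak{S}')}{(\mathfrak{S}'\mathfrak{S}''\mathfrak{S}''')}, \qquad \Delta q_3 = -\frac{(\mathfrak{S}\mathfrak{S}'\mathfrak{S}'')}{(\mathfrak{S}'\mathfrak{S}''\mathfrak{S}''')}. \qquad (20)$$

* These equations are obtained by taking the direct products of both members of the preceding equation with $\mathfrak{S}''\times\mathfrak{S}'''$, $\mathfrak{S}'''\times\mathfrak{S}'$, and $\mathfrak{S}'\times\mathfrak{S}''$ respectively. See footnote on page 119.

From the corrected values of $q_1$, $q_2$, $q_3$ we may calculate a new residual $\mathfrak{S}$, and from that determine another correction for each of the quantities $q_1$, $q_2$, $q_3$.

It will sometimes be worth while to use formulas a little less simple for the sake of a more rapid approximation. Instead of equation (19) we may write, with a higher degree of accuracy,

$$-\mathfrak{S} = \mathfrak{S}'\Delta q_1+\mathfrak{S}''\Delta q_2+\mathfrak{S}'''\Delta q_3+\tfrac12\mathfrak{T}'(\Delta q_1)^{2}+\tfrac12\mathfrak{T}''(\Delta q_2)^{2}+\tfrac12\mathfrak{T}'''(\Delta q_3)^{2}, \qquad (21)$$

where

$$\mathfrak{T}' = \frac{d^{2}\mathfrak{S}_1}{dq_1^{2}} = 2A_1B_1\,\frac{d(r_1^{-3})}{dq_1}\,\mathfrak{F}_1 + \frac{B_1}{1+B_1r_1^{-3}}\,\frac{d^{2}(r_1^{-3})}{dq_1^{2}}\,\mathfrak{S}_1,$$
$$\mathfrak{T}'' = \frac{d^{2}\mathfrak{S}_2}{dq_2^{2}} = 2B_2\,\frac{d(r_2^{-3})}{dq_2}\,\mathfrak{F}_2 - \frac{B_2}{1-B_2r_2^{-3}}\,\frac{d^{2}(r_2^{-3})}{dq_2^{2}}\,\mathfrak{S}_2, \qquad (22)$$
$$\mathfrak{T}''' = \frac{d^{2}\mathfrak{S}_3}{dq_3^{2}} = 2A_3B_3\,\frac{d(r_3^{-3})}{dq_3}\,\mathfrak{F}_3 + \frac{B_3}{1+B_3r_3^{-3}}\,\frac{d^{2}(r_3^{-3})}{dq_3^{2}}\,\mathfrak{S}_3.$$

It is evident that $\mathfrak{T}''$ is generally many times greater than $\mathfrak{T}'$ or $\mathfrak{T}'''$, the factor $B_2$, in the case of equal intervals, being exactly ten times as great as $A_1B_1$ or $A_3B_3$. This shows, in the first place, that the accurate determination of $\Delta q_2$ is of the most importance for the subsequent approximations. It also shows that we may attain nearly the same accuracy in writing

$$-\mathfrak{S} = \mathfrak{S}'\Delta q_1+\mathfrak{S}''\Delta q_2+\mathfrak{S}'''\Delta q_3+\tfrac12\mathfrak{T}''(\Delta q_2)^{2}. \qquad (23)$$

We may, however, often do a little better than this without using a more complicated equation. For $\mathfrak{T}'+\mathfrak{T}'''$ may be estimated very roughly as equal to $\tfrac15\mathfrak{T}''$. Whenever, therefore, $\Delta q_1$ and $\Delta q_3$ are about as large as $\Delta q_2$, as is often the case, it may be a little better to use the coefficient $\tfrac{6}{10}$ instead of $\tfrac12$ in the last term. For $\Delta q_2$, then, we have the equation

$$-(\mathfrak{S}\mathfrak{S}'''\mathfrak{S}') = (\mathfrak{S}'\mathfrak{S}''\mathfrak{S}''')\,\Delta q_2+\tfrac{6}{10}(\mathfrak{T}''\mathfrak{S}'''\mathfrak{S}')\,(\Delta q_2)^{2}. \qquad (24)$$

$(\mathfrak{T}''\mathfrak{S}'''\mathfrak{S}')$ is easily computed from the formula

… (25)

which may be derived from equations (18) and (22).

The quadratic equation (24) gives two values of the correction to be applied to the position of the body. When they are not too large, they will belong to two different solutions of the problem, generally to the two least removed from the values assumed. But a very large value of $\Delta q_2$ must not be regarded as affording any trustworthy indication of a solution of the problem.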
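The correction step of equation (19), solved as in the footnote by direct products with the three skew products, amounts to Cramer's rule applied to three vectors. The sketch below illustrates this under assumptions of its own: vectors are plain 3-tuples, the names `triple`, `linear_corrections`, and `refine_dq2` are hypothetical, and the quantity `L` stands for the ratio of the triple product containing 𝔗'' to the triple product (𝔖'𝔖''𝔖'''), with the coefficient 6/10 taken as in the quadratic equation for Δq₂. The refinement is solved by repeated substitution.

```python
def cross(u, v):
    """Skew product of two 3-vectors."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def dot(u, v):
    """Direct product of two 3-vectors."""
    return u[0] * v[0] + u[1] * v[1] + u[2] * v[2]


def triple(u, v, w):
    """Triple product (u v w) = u.(v x w)."""
    return dot(u, cross(v, w))


def linear_corrections(S, S1, S2, S3):
    """Solve -S = S1*dq1 + S2*dq2 + S3*dq3 for (dq1, dq2, dq3),
    where S1, S2, S3 stand for the derivatives S', S'', S'''."""
    G = triple(S1, S2, S3)
    if G == 0:
        raise ValueError("the three derivative vectors are coplanar")
    return (-triple(S, S2, S3) / G,
            -triple(S, S3, S1) / G,
            -triple(S, S1, S2) / G)


def refine_dq2(c2, L, iterations=20):
    """Solve dq2 = c2 - (6/10) * L * dq2**2 by repeated substitution,
    starting from the value c2 obtained by neglecting the quadratic term."""
    dq2 = c2
    for _ in range(iterations):
        dq2 = c2 - 0.6 * L * dq2 * dq2
    return dq2


if __name__ == "__main__":
    S1, S2, S3 = (1.0, 0.2, 0.0), (0.1, 1.0, 0.3), (0.0, 0.4, 1.0)
    S = (0.01, -0.02, 0.03)
    print(linear_corrections(S, S1, S2, S3))
```

Repeated substitution converges only toward the small root; the other root of the quadratic, as the text observes, belongs to a different solution of the problem and is not sought here.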
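In the majority of cases we only care for one of the roots of the equation, which is distinguished by being very small, and which will be most easily calculated by a small correction to the value which we get by neglecting the quadratic term.*

* When a comet is somewhat near the earth we may make use of the fact that the earth’s orbit is one solution of the problem, i.e., that $-\rho_2$ is one value of $\Delta q_2$, to save the trifling labor of computing the value of $(\mathfrak{T}''\mathfrak{S}'''\mathfrak{S}')$. For it is evident from the theory of equations that if $-\rho_2$ and $z$ are the two roots,

$$\rho_2 - z = \frac{(\mathfrak{S}'\mathfrak{S}''\mathfrak{S}''')}{\tfrac{6}{10}(\mathfrak{T}''\mathfrak{S}'''\mathfrak{S}')}, \qquad -\rho_2 z = \frac{(\mathfrak{S}\mathfrak{S}'''\mathfrak{S}')}{\tfrac{6}{10}(\mathfrak{T}''\mathfrak{S}'''\mathfrak{S}')}.$$

Eliminating $(\mathfrak{T}''\mathfrak{S}'''\mathfrak{S}')$, we have

$$(\rho_2 - z)(\mathfrak{S}\mathfrak{S}'''\mathfrak{S}') = -z\rho_2(\mathfrak{S}'\mathfrak{S}''\mathfrak{S}'''),$$

whence

$$\frac{1}{z} = \frac{1}{\rho_2} - \frac{(\mathfrak{S}'\mathfrak{S}''\mathfrak{S}''')}{(\mathfrak{S}\mathfrak{S}'''\mathfrak{S}')}.$$

Now $-\dfrac{(\mathfrak{S}\mathfrak{S}'''\mathfrak{S}')}{(\mathfrak{S}'\mathfrak{S}''\mathfrak{S}''')}$ is the value of $\Delta q_2$ which we obtain if we neglect the quadratic term in equation (24). If we call this value $[\Delta q_2]$, we have for the more exact value

$$\Delta q_2 = \frac{[\Delta q_2]}{1+\dfrac{[\Delta q_2]}{\rho_2}}. \qquad (26)$$

The quantities $\Delta q_1$ and $\Delta q_3$ might be calculated by the equations … $q_3$, it will not be necessary to recalculate the values of $\mathfrak{S}'$, $\mathfrak{S}''$, $\mathfrak{S}'''$, when these have been calculated from fairly good values of $q_1$, $q_2$, $q_3$. But when, as is generally the case, the first assumption is only a rude guess, the values of $\mathfrak{S}'$, $\mathfrak{S}''$, $\mathfrak{S}'''$ should be recalculated after one or two corrections of $q_1$, $q_2$, $q_3$. To get the best results when we do not recalculate $\mathfrak{S}'$, $\mathfrak{S}''$, $\mathfrak{S}'''$, we may proceed as follows: Let $\mathfrak{S}'$, $\mathfrak{S}''$, $\mathfrak{S}'''$ denote the values which have been calculated; $Dq_1$, $Dq_2$, $Dq_3$, respectively, the sum of the corrections of each of the quantities $q_1$, $q_2$, $q_3$, which have been made since the calculation of $\mathfrak{S}'$, $\mathfrak{S}''$, $\mathfrak{S}'''$; $\mathfrak{S}$ the residual after all the corrections of $q_1$, $q_2$, $q_3$, which have been made; and $\Delta q_1$, $\Delta q_2$, $\Delta q_3$ the remaining corrections which we are seeking. We have, then, very nearly

$$-\mathfrak{S} = \{\mathfrak{S}'+\mathfrak{T}'(Dq_1+\tfrac12\Delta q_1)\}\Delta q_1+\{\mathfrak{S}''+\mathfrak{T}''(Dq_2+\tfrac12\Delta q_2)\}\Delta q_2+\{\mathfrak{S}'''+\mathfrak{T}'''(Dq_3+\tfrac12\Delta q_3)\}\Delta q_3. \qquad (29)$$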
The same considerations which we applied to equation (21) enable us to simplify this equation also, and to write with a fair degree of accuracy

$$-(\mathfrak{S}\mathfrak{S}'''\mathfrak{S}') = \{(\mathfrak{S}'\mathfrak{S}''\mathfrak{S}''')+\tfrac65(\mathfrak{T}''\mathfrak{S}'''\mathfrak{S}')(Dq_2+\tfrac12\Delta q_2)\}\Delta q_2, \qquad (30)$$

$$\Delta q_1 = [\Delta q_1]+\Delta q_2-[\Delta q_2], \qquad \Delta q_3 = [\Delta q_3]+\Delta q_2-[\Delta q_2], \qquad (31)$$

where

$$[\Delta q_1] = -\frac{(\mathfrak{S}\mathfrak{S}''\mathfrak{S}''')}{(\mathfrak{S}'\mathfrak{S}''\mathfrak{S}''')}, \qquad [\Delta q_2] = -\frac{(\mathfrak{S}\mathfrak{S}'''\mathfrak{S}')}{(\mathfrak{S}'\mathfrak{S}''\mathfrak{S}''')}, \qquad [\Delta q_3] = -\frac{(\mathfrak{S}\mathfrak{S}'\mathfrak{S}'')}{(\mathfrak{S}'\mathfrak{S}''\mathfrak{S}''')}. \qquad (32)$$

Correction of the Fundamental Equation.

When we have thus determined, by the numerical solution of our fundamental equation, approximate values of the three positions of the body, it will always be possible to apply a small numerical correction to the equation, so as to make it agree exactly with the laws of elliptic motion in a fictitious case differing but little from the actual. After such a correction the equation will evidently apply to the actual case with a much higher degree of approximation.

There is room for great diversity in the application of this principle. The method which appears to the writer the most simple and direct is the following, in which the correction of the intervals for aberration is combined with the correction required by the approximate nature of the equation.*

The solution of the fundamental equation gives us three points, which must necessarily lie in one plane with the sun, and in the lines of sight of the several observations. Through these points we may pass an ellipse, and calculate the intervals of time required by the exact laws of elliptic motion for the passage of the body between them. If these calculated intervals should be identical with the given intervals, corrected for aberration, we would evidently have the true solution of the problem. But suppose, to fix our ideas, that the calculated intervals are a little too long.
It is evident that if we repeat our calculations, using in our fundamental equation intervals shortened in the same ratio as the calculated intervals have come out too long, the intervals calculated from the second solution of the fundamental equation must agree almost exactly with the desired values. If necessary, this process may be repeated, and thus any required degree of accuracy may be obtained, whenever the solution of the uncorrected equation gives an approximation to the true positions. For this it is necessary that the intervals should not be too great. It appears, however, from the results of the example of Ceres, given hereafter, in which the heliocentric motion exceeds 62° but the calculated values of the intervals of time differ from the given values by little more than one part in two thousand, that we have here not approached the limit of the application of our formula.

In the usual terminology of the subject, the fundamental equation with intervals uncorrected for aberration represents the first hypothesis; the same equation with the intervals affected by certain numerical coefficients (differing little from unity) represents the second hypothesis; the third hypothesis, should such be necessary, is represented by a similar equation with corrected coefficients, etc.

In the process indicated there are certain economies of labor which should not be left unmentioned, and certain precautions to be observed in order that the neglected figures in our computations may not unduly influence the result.

It is evident, in the first place, that for the correction of our fundamental equation we need not trouble ourselves with the position of the orbit in the solar system. The intervals of time, which determine this correction, depend only on the three heliocentric distances $r_1$, $r_2$, $r_3$ and the two heliocentric angles, which will be represented by $v_2 - v_1$ and $v_3 - v_2$, if we write $v_1$, $v_2$, $v_3$ for the true anomalies.
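The passage from the first hypothesis to the second may be sketched in a few lines. This is an illustration under stated assumptions and not the computing scheme of the Summary: the coefficients are simply recomputed from the altered intervals by their definitions in terms of τ₁ and τ₃, whereas the Summary applies small corrections to the logarithms of the coefficients; the correction for aberration is omitted; and the function names are hypothetical.

```python
def coefficients(tau1, tau3):
    """Coefficients A1, A3, B1, B2, B3 of the fundamental equation
    for the intervals tau1 and tau3."""
    total = tau1 + tau3
    A1 = tau1 / total
    A3 = tau3 / total
    B1 = (-tau1 ** 2 + tau1 * tau3 + tau3 ** 2) / 12
    B2 = (tau1 ** 2 + 3 * tau1 * tau3 + tau3 ** 2) / 12
    B3 = (tau1 ** 2 + tau1 * tau3 - tau3 ** 2) / 12
    return A1, A3, B1, B2, B3


def corrected_interval(tau_given, tau_calc):
    """Interval for the next hypothesis: the given interval altered in the
    ratio in which the calculated interval missed it (aberration ignored)."""
    return tau_given * (tau_given / tau_calc)


if __name__ == "__main__":
    # Hypothetical numbers: both calculated intervals came out slightly too long.
    tau1, tau3 = 0.2, 0.2
    tau1_next = corrected_interval(tau1, 0.2001)
    tau3_next = corrected_interval(tau3, 0.2001)
    print(coefficients(tau1, tau3))
    print(coefficients(tau1_next, tau3_next))
```

For equal intervals the sketch reproduces the remark that B₂ is exactly ten times A₁B₁ or A₃B₃, which gives a convenient check on the signs in the definitions.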
These angles ($v_2 - v_1$ and $v_3 - v_2$) may be determined from $r_1$, $r_2$, $r_3$ and $n_1$, $n_2$, $n_3$, and therefore from $r_1$, $r_2$, $r_3$ and the given intervals. For our fundamental equation, which may be written

$$n_1\mathfrak{R}_1 - n_2\mathfrak{R}_2 + n_3\mathfrak{R}_3 = 0, \qquad (33)$$

indicates that we may form a triangle in which the lengths of the sides shall be $n_1r_1$, $n_2r_2$, and $n_3r_3$ (let us say for brevity, $s_1$, $s_2$, $s_3$), and the directions of the sides parallel with the three heliocentric directions of the body. The angles opposite $s_1$ and $s_3$ will be respectively $v_3 - v_2$ and $v_2 - v_1$. We have, therefore, by a well-known formula,

$$\tan\tfrac12(v_3-v_2) = \sqrt{\frac{(s_1-s_2+s_3)(s_1+s_2-s_3)}{(s_1+s_2+s_3)(-s_1+s_2+s_3)}}, \qquad \tan\tfrac12(v_2-v_1) = \sqrt{\frac{(-s_1+s_2+s_3)(s_1-s_2+s_3)}{(s_1+s_2+s_3)(s_1+s_2-s_3)}}. \qquad (34)$$

* When an approximate orbit is known in advance, we may correct the fundamental equation at once. The formulae will be given in the Summary, § xii.

As soon, therefore, as the solution of our fundamental equation has given a sufficient approximation to the values of $r_1$, $r_2$, $r_3$ (say five- or six-figure values, if our final result is to be as exact as seven-figure logarithms can make it), we calculate $n_1$, $n_2$, $n_3$ with seven-figure logarithms by equations (2), and the heliocentric angles by equations (34).

The semi-parameter corresponding to these values of the heliocentric distances and angles is given by the equation

$$p = \frac{n_1r_1-n_2r_2+n_3r_3}{n_1-n_2+n_3}. \qquad (35)$$

The expression $n_1-n_2+n_3$, which occurs in the value of the semi-parameter, and the expression $n_1r_1-n_2r_2+n_3r_3$, or $s_1-s_2+s_3$, which occurs both in the value of the semi-parameter and in the formulae for determining the heliocentric angles, represent small quantities of the second order (if we call the heliocentric angles small quantities of the first order), and cannot be very accurately determined from approximate numerical values of their separate terms. The first of these quantities may, however, be determined accurately by the formula

$$n_1-n_2+n_3 = \frac{A_1B_1}{r_1^{3}}+\frac{B_2}{r_2^{3}}+\frac{A_3B_3}{r_3^{3}}. \qquad (36)$$

With respect to the quantity $s_1-s_2+s_3$, a little consideration will show that if we are careful to use the same value wherever the expression occurs, both in the formulae for the heliocentric angles and for the semi-parameter, the inaccuracy of the determination of this value from the cause mentioned will be of no consequence in the process of correcting the fundamental equation. For although the logarithm of $s_1-s_2+s_3$ as calculated by seven-figure logarithms from $r_1$, $r_2$, $r_3$ may be accurate only to four or five figures, we may regard it as absolutely correct if we make a very small change in the value of one of the heliocentric distances (say $r_2$). We need not trouble ourselves farther about this change, for it will be of a magnitude which we neglect in computations with seven-figure tables. That the heliocentric angles thus determined may not agree as closely as they might with the positions on the lines of sight determined by the first solution of the fundamental equation is of no especial consequence in the correction of the fundamental equation, which only requires the exact fulfilment of two conditions, viz., that our values of the heliocentric distances and angles shall have the relations required by the fundamental equation to the given intervals of time, and that they shall have the relations required by the exact laws of elliptic motion to the calculated intervals of time. The third condition, that none of these values shall differ too widely from the actual values, is of a looser character.

After the determination of the heliocentric angles and the semi-parameter, the eccentricity and the true anomalies of the three positions may next be determined, and from these the intervals of time. These processes require no especial notice. The appropriate formulae will be given in the Summary of Formulae.

Determination of the Orbit from the Three Positions and the Intervals of Time.
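The values of the semi-parameter and the heliocentric angles as given in the preceding paragraphs depend upon the quantity $s_1-s_2+s_3$, the numerical determination of which from $s_1$, $s_2$, and $s_3$ is critical to the second degree when the heliocentric angles are small. This was of no consequence in the process which we have called the correction of the fundamental equation. But for the actual determination of the orbit from the positions given by the corrected equation (or by the uncorrected equation, when we judge that to be sufficient) a more accurate determination of this quantity will generally be necessary. This may be obtained in different ways, of which the following is perhaps the most simple. Let us set

$$\mathfrak{S}_4 = \mathfrak{S}_3-\mathfrak{S}_1, \qquad (37)$$

and $s_4$ for the length of the vector $\mathfrak{S}_4$, obtained by taking the square root of the sum of the squares of the components of the vector. It is evident that $s_2$ is the longer and $s_4$ the shorter diagonal of a parallelogram of which the sides are $s_1$ and $s_3$. The area of the triangle having the sides $s_1$, $s_2$, $s_3$ is therefore equal to that of the triangle having the sides $s_1$, $s_3$, $s_4$, each being one-half of the parallelogram. This gives

$$(s_1+s_2+s_3)(-s_1+s_2+s_3)(s_1-s_2+s_3)(s_1+s_2-s_3) = (s_1+s_4+s_3)(-s_1+s_4+s_3)(s_1-s_4+s_3)(s_1+s_4-s_3), \qquad (38)$$

and

$$s_1-s_2+s_3 = \frac{(s_1+s_4+s_3)(-s_1+s_4+s_3)(s_1-s_4+s_3)(s_1+s_4-s_3)}{(s_1+s_2+s_3)(-s_1+s_2+s_3)(s_1+s_2-s_3)}. \qquad (39)$$

The numerical determination of this value of $s_1-s_2+s_3$ is critical only to the first degree.

The eccentricity and the true anomalies may be determined in the same way as in the correction of the formula.

The position of the orbit in space may be derived from the following considerations. The vector $-\mathfrak{S}_2$ is directed from the sun toward the second position of the body; the vector $\mathfrak{S}_4$ from the first to the third position. If we set

$$\mathfrak{S}_5 = \mathfrak{S}_4-\frac{(\mathfrak{S}_4.\mathfrak{S}_2)}{s_2^{2}}\,\mathfrak{S}_2, \qquad (40)$$

the vector $\mathfrak{S}_5$ will be in the plane of the orbit, perpendicular to $-\mathfrak{S}_2$ …

$\xi_1$, $\eta_1$, $\zeta_1$ (components of $\mathfrak{F}_1$) = the direction-cosines of the observed position, corrected for the aberration of the fixed stars.

$$\mathfrak{E}_1^{2} = X_1^{2}+Y_1^{2}+Z_1^{2}, \qquad (\mathfrak{E}_1.\mathfrak{F}_1) = X_1\xi_1+Y_1\eta_1+Z_1\zeta_1.$$

Preliminary computations relating to the second and third observations. The formulae are entirely analogous to those relating to the first observation, the quantities being distinguished by the proper suffixes. III. Equations of the first hypothesis. When the preceding quantities have been computed, their numerical values (or their logarithms, when more convenient for computation,) are to be substituted in the following equations : Components of 1 rf For control : F-- . 3E,gi '(l+i^K2 ^2 = />2 + (®2-S2) ri=' a «lì /3' = Atf x+AlnlRl - P'ft j- III' y = Ail+A —r'yi J Components of @2 a2= -&(l-iy(g,+^-(®g.^) &= -^(i-J?2)(?2+|J-("' «'w-^l6+4,fA--Pwa.i r = As%+AsVaRs-F"^\m"' V" = 48fs+- P”'ya j The computer is now to assume any reasonable values either of the geocentric distances, pv p2, p3, or of the heliocentric distances, rly r2, r3 (the former in the case of a comet, the latter in the case of an asteroid), and from these assumed values to compute the rest of the following quantities: By equations IIIj, III'. By equations IILj, III". By equations III3, III'". . «-«!+«,+* C1=-^±M±£iZ £=&+&+& C*----a>a+b^+c*y y=Vi+y2+y3 c8--^a tMlfsy ^2=^2—fsL(Aq^f. (This equation will generally be most easily solved by repeated substitutions.) A^i = C1—ii3rX(Aq2)2 Ag3=Oz—-^uL(Aq2)2. VI. Successive corrections. A unless the assumed values represent a fair approximation. Whether L is also to be recomputed, depends on its magnitude, and on that of the correction of q2> which remains to be made. In the later stages of the work, when the corrections are small, the terms containing L may be neglected altogether.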
If we set = (40) the vector 05 will be in the plane of the orbit, perpendicular to — fi (components of 3i) = the direction-cosines of the observed position, corrected for the aberration of the fixed stars. ffi1*=zi*+r1*+^1* (<&1$J=x1€i+rln1+zl£1DETERMINATION OF ELLIPTIC ORBITS. 131 Preliminary computations relating to the second and third observations. The formulae are entirely analogous to those relating to the first observation, the quantities being distinguished by the proper suffixes. III. Equations of the first hypothesis. When the preceding quantities have been computed, their numerical values (or their logarithms, when more convenient for computation,) are to be substituted in the following equations : Components of 1 rf For control : F-- . 3E,gi '(l+i^K2 ^2 = />2 + (®2-S2) ri=' a «lì /3' = Atf x+AlnlRl - P'ft j- III' y = Ail+A —r'yi J Components of @2 a2= -&(l-iy(g,+^-(®g.^) &= -^(i-J?2)(?2+|J-("' «'w-^l6+4,fA--Pwa.i r = As%+AsVaRs-F"^\m"' V" = 48fs+- P”'ya j The computer is now to assume any reasonable values either of the geocentric distances, pv p2, p3, or of the heliocentric distances, rly r2, r3 (the former in the case of a comet, the latter in the case of an asteroid), and from these assumed values to compute the rest of the following quantities: By equations IIIj, III'. By equations IILj, III". By equations III3, III'". . «-«!+«,+* C1=-^±M±£iZ £=&+&+& C*----a>a+b^+c*y y=Vi+y2+y3 c8--^a tMlfsy ^2=^2—fsL(Aq^f. (This equation will generally be most easily solved by repeated substitutions.) A^i = C1—ii3rX(Aq2)2 Ag3=Oz—-^uL(Aq2)2. VI. Successive corrections. A unless the assumed values represent a fair approximation. Whether L is also to be recomputed, depends on its magnitude, and on that of the correction of q2> which remains to be made. In the later stages of the work, when the corrections are small, the terms containing L may be neglected altogether. 
The corrections of qv q2, qz should be repeated until the equations a = 0 ¡3 = 0 y = 0134 DETERMINATION OF ELLIPTIC ORBITS. are nearly satisfied. Approximate values of r1> r2, r3 may suffice for the following computations, which, however, must be made with the greatest exactness. VII. Test of the first hypothesis. l°gri, logr2, l°g r3 (approximate values from the preceding computations). N— A.B^fi+B2r2z+A3B3r3z si = Alrl+A1Birl 2 s2 = r2-B2r22 s3 — A 3r3+A3B3r32 s — i(siA-s2-\-s3) S ■— Si, S ““ So, S — s The value of s —s2 may be very small, and its logarithm in consequence ill determined. This will do no harm if the computer is careful to use the same value—computed, of course, as carefully as possible—wherever the expression occurs in the following formulae: ‘-Vs -sjjs-s^is-ss) p= 2(s-s2) M tanH^-«i)=rrr s —s3 o Oi tan %(vs—Wj) = —^-2 For adjustment of values : \(vz — vf) = J(v2—-f1 (vz—v2) P_P e sin 1(^3+^) = ri ^3 2smb(vz-v1) e cos 4-^) = tani^g-f^) P+P-2 ri 2 COS ¿(^3 — ^) For control: P e cos v9~ — 2 ^2 1 a = P 1—e2 tan \EX = e tan tan \E2 — e tan \v2 tan \EZ = e tan \vz rx calc> = ofi(Ez — E2) + ea% sin E2 — ec$ sin Ez r3 calc. = a^(i?2 — jE^)-fea% sin Ex — ea% sin i?2DETERMINATION OF ELLIPTIC ORBITS. 135 VIII. For the second hypothesis. $tx — *0057613k(p2 — ps) (aberration-constant after Struve.) Sts = *0057613Jc(Pl - p2) log (*0057613&) = 5*99610 A log rx = log TX - log (Tl calc - Stx) A log T3 = log T3 - log (t8 calc. - St2) A log (Tit8) = A log Tl + a log Tg A log ^ = A log rx — A log r3 T3 A log Ax = — A3 A log — T3 A log As — log — T3 A log B, = A log (TlT3) - A log AlogJB2=Alog(TlT8)+^^ Alog^ A log 5a = A log (Tjt3)+A log ^ These corrections are to be added to the logarithms of A1} A3, Bx, B2, Bb> in equations IIIl5 III2, III3, and the corrected equations used to correct the values of qly q2, qB, until the residuals a, fi, y vanish. 
The new values of Av A3 must satisfy the relation A1+As = l, and the corrections Alog^j, AlogA3 must be adjusted, if necessary, for this end. Third hypothesis. A second correction of equations IIIj, III2, IH3 may be obtained in the same manner as the first, but this will rarely be necessary. IX. Determination of the ellipse. It is supposed that the values of aV i®l> 7l> a2’ j®2> V2> a3> i®3» Ys> ^*2» ^*3> -^1> B>2> Bb, Sj, S2y Sg, have been computed by equations III^ III2, IH3 with the greatest exactness, so as to make the residuals a, /3, y vanish, and that the two formulae for each of the quantities sX)s2, sB give sensibly the same value.136 DETERMINATION OF ELLIPTIC ORBITS. Components of ©4 a4=a3-ai ßi = ß3 ~~ ßl 74 = 73-71 Components of ©5 a4a2+/8A+y4y2 a5 — a4 — q 2 a2 S2 O _/D a4«2+ ^2 + 7472 /D P5 — P4 ¡T2 P2 S2 .. a4a2 + /34/32 + y4y2 ys-y4-----------2 y2 s42=«42+/342+y42 852=a62+/362+y62 >^a*+ß*-S = KSl + S2 + S8) ^ = Ksi + s4 + s8) For control only: SjS-sMS-sJjS-s,) 2 s) tanK^-%)=^ 1^= J.1jR1 + i^2 + -4.3i2s 2r2s tan £(«3-%>=%_< ^=A7 R$ -Z\T(s—sx)(s—s3) tan H^3 — ^1) ~ /o _ O.V.2 — Sl)(S-S3) The computer should be careful to use the corrected values of A1} Az. (See VIII.) Trifling errors in the angles should be distributed. r, To esin^Vs+vù=¥dnKVi_Vi) r+r~2 e*osHvi+v1)=^a^_vj tan K^+^i) e~f(21 AIogT3 = ifyf~f(u where T(l)i T{2), Tm denote respectively the values obtained from the first, second, and third observations, and M the modulus of common logarithms. For an ephemeris. ~(t-T) = E-esmE a- Heliocentric coordinates. (Components of ) x — — eax4-axcosE-\-bx sin E y— — eay+ay cos E-\-by sin E z — — eaz+az cos E+ bz sin E These equations are completely controlled by the agreement of the computed and observed positions and the following relations between the constants : axbx+ayby+azbz = 0 ax2 + ay2+az2 = a2 b2 + b2 + bz2 = (1 - e2)a2138 DETERMINATION OF ELLIPTIC ORBITS. 
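The ephemeris formulae of this section (Kepler's equation, and heliocentric coordinates of the form $x = -e\,a_x + a_x\cos E + b_x\sin E$) can be sketched in Python. This is a minimal illustration under assumptions: the left-hand member of Kepler's equation is supplied directly as a mean anomaly `M` in radians, the Newton iteration is a choice of convenience rather than anything prescribed by the text, and all names and sample constants are hypothetical.

```python
import math


def solve_kepler(M, e, tol=1e-14, max_iter=50):
    """Solve M = E - e*sin(E) for the excentric anomaly E by Newton's method."""
    E = M if e < 0.8 else math.pi
    for _ in range(max_iter):
        step = (E - e * math.sin(E) - M) / (1.0 - e * math.cos(E))
        E -= step
        if abs(step) < tol:
            break
    return E


def heliocentric(a_vec, b_vec, e, E):
    """Coordinates -e*a + a*cos(E) + b*sin(E), component by component."""
    return tuple(
        -e * a + a * math.cos(E) + b * math.sin(E)
        for a, b in zip(a_vec, b_vec)
    )


def constants_consistent(a_vec, b_vec, a, e, tol=1e-12):
    """Check the control relations a.b = 0, |a|^2 = a^2, |b|^2 = (1 - e^2) a^2."""
    dot_ab = sum(p * q for p, q in zip(a_vec, b_vec))
    a_sq = sum(p * p for p in a_vec)
    b_sq = sum(q * q for q in b_vec)
    return (
        abs(dot_ab) < tol
        and abs(a_sq - a * a) < tol
        and abs(b_sq - (1.0 - e * e) * a * a) < tol
    )


if __name__ == "__main__":
    a, e = 2.0, 0.3                      # illustrative elements only
    a_vec = (a, 0.0, 0.0)
    b_vec = (0.0, a * math.sqrt(1.0 - e * e), 0.0)
    E = solve_kepler(1.0, e)
    print(E, heliocentric(a_vec, b_vec, e, E))
```

With constants satisfying the control relations, the length of the computed vector equals $a(1 - e\cos E)$, which gives a further check on the arithmetic.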
XII. When an approximate orbit is known in advance, we may use it to improve our fundamental equation. The following appears to be the most simple method: Find the excentric anomalies $E_1$, $E_2$, $E_3$, and the heliocentric distances $r_1$, $r_2$, $r_3$, which belong in the approximate orbit to the times of observation corrected for aberration. Calculate $B_1$, $B_3$, as in §I, using these corrected times. Determine $A_1$, $A_3$ by the equation
$$\frac{A_1(1+B_1r_1^{-3})}{\sin(E_3-E_2) - e\sin E_3 + e\sin E_2} = \frac{A_3(1+B_3r_3^{-3})}{\sin(E_2-E_1) - e\sin E_2 + e\sin E_1}$$
in connection with the relation $A_1 + A_3 = 1$. Determine $B_2$ so as to make
$$\frac{A_1B_1r_1^{-3} + B_2r_2^{-3} + A_3B_3r_3^{-3}}{4\sin\tfrac12(E_2-E_1)\,\sin\tfrac12(E_3-E_2)\,\sin\tfrac12(E_3-E_1)}$$
equal to either member of the last equation. It is not necessary that the times for which $E_1$, $E_2$, $E_3$, $r_1$, $r_2$, $r_3$ are calculated should precisely agree with the times of observation corrected for aberration. Let the former be represented by $t_1'$, $t_2'$, $t_3'$, and the latter by $t_1''$, $t_2''$, $t_3''$; and let
$$\Delta\log\tau_1 = \log(t_3''-t_2'') - \log(t_3'-t_2'), \qquad \Delta\log\tau_3 = \log(t_2''-t_1'') - \log(t_2'-t_1').$$
We may find $B_1$, $B_3$, $A_1$, $A_3$, $B_2$, as above, using $t_1'$, $t_2'$, $t_3'$, and then use $\Delta\log\tau_1$, $\Delta\log\tau_3$ to correct their values, as in §VIII.

Numerical Example.

To illustrate the numerical computations we have chosen the following example, both on account of the large heliocentric motion, and because Gauss and Oppolzer have treated the same data by their different methods. The data are taken from the Theoria Motus, § 159, viz.,

Times, 1805, September         5.51336           139.42711         265.39813
Longitudes of Ceres            95° 32' 18".56    99° 49' 5".87     118° 5' 28".85
Latitudes of Ceres             -0° 59' 34".06    +7° 16' 36".80    +7° 38' 49".39
Longitudes of the Earth        342° 54' 56".00   117° 12' 43".25   241° 58' 50".71
Logs of the Sun's distance     0.0031514         9.9929861         0.0056974

The positions of Ceres have been freed from the effects of parallax and aberration.

I. From the given times we obtain the following values: Numbers. Logarithms.
¿2 - tx 133*91375 2*1268252 h~h 125*97102 2*1002706 h ~ h 259*88477 2*4147809 A *4847187 9*6854897 A *5152812 9*7120443 ri *3358520 r3 *3624066 B! 9*6692113 B, *3183722 b3 9*5623916 Control: AXBX+B2 + AbBb = 2-4959086 1x^3 = 2-4959081 II. From the given positions we get: logX, 9*9835515 + i log X2 9*6531725 _ log Xo 9*6775810 _ log^i 9-4711748 - i log Y2 9*9420444 + log r3 9*9515547 - -Zi 0 A 0 A 0 logii 8*9845270 - log £2 9-2282738 - log ^3 9*6690294 - log rn 9*9979027 + log V2 9*9900800 + log Vs 9*9416855 + Iogf) 8*2387150 - logf2 9*1026549 + logL 9*1240813 + *3874081 - @2*?2 *9314223 + *5599304 - Pi *8645336 + P-2 *1006681 + Ps *7130624 + III. The preceding computations furnish the numerical values for the equations IIIj, III', III2, III", III3, ill'", which follow. Brackets indicate that logarithms have been substituted for numbers. We have now to assume some values for the heliocentric distances ri> r2> r3- -A- mean proportional between the mean distances of Mars and Jupiter from the Sun suggests itself as a reasonable assumption. In order, however, to test the convergence of the computations, when the assumptions are not happy, we will make the much less probable assumption (actually much farther from the truth) that the heliocentric distances are an arithmetical mean between the distances of Mars and Jupiter. This gives *526 for the logarithm of each of the distances rl9 r2, rs. From these assumed values we compute the first column of numbers in the three following tables :140 DETERMINATION OF ELLIPTIC ORBITS. 5'l = p1 —'S874081 «1= -[8-6700167](21-9'5901555)(l+Ä1)'j ris=2i +-8645336 &= [9-6833924] (ft+ ■0900552)(l+JR1)ilII1 Äl=[9-6692113]r1-3 y1=-[7-9242047] (ft+ -3874081X1+ ^i)J a'=--046775-[8-67002]ii1-P'a11 /3'= -482383+[9-68339]P1-P'/31 J-IH' ( ’ l Tl y'= --008399-[7-92420]P1-P/y1J A^i - -66731 - -04558 - -0010434 +-0000006 CO»O Ml> -------------- . 
[The three tables of successive approximations stood here; their numerical entries are illegible in this copy.]

[…] I should be inclined to take the components in (7) in the direction of one of the coordinate axes, choosing that one which is most nearly directed towards the third observed position. However, I will write […] theorems of Olbers and Lambert. It is evident that in general the error in I is of the fifth order, in IIa, IIb, IIc of the fourth, and in III of the third. But for equal intervals, the error in I is of the sixth order, and in III of the fourth. And when $t_2^2 + t_3^2 - 3t_1^2 = 0$, […] $p_1$, $p_2$, $p_3$ for the coefficients of $\mathfrak{R}_1$, $\mathfrak{R}_2$, $\mathfrak{R}_3$ in I, and $\mathfrak{Z}$ for the error of the equation, we have exactly
$$p_1\mathfrak{R}_1 - p_2\mathfrak{R}_2 + p_3\mathfrak{R}_3 = \mathfrak{Z},$$
which gives
$$p_1\,\mathfrak{R}_1\times\mathfrak{R}_3 - p_2\,\mathfrak{R}_2\times\mathfrak{R}_3 = \mathfrak{Z}\times\mathfrak{R}_3,$$
$$\frac{\mathfrak{R}_2\times\mathfrak{R}_3}{\mathfrak{R}_1\times\mathfrak{R}_3} = \frac{p_1}{p_2} - \frac{\mathfrak{Z}\times\mathfrak{R}_3}{p_2\,\mathfrak{R}_1\times\mathfrak{R}_3}.$$
Now $p_1/p_2$ is my expression for the ratio of the triangles, and $-\mathfrak{Z}\times\mathfrak{R}_3/(p_2\,\mathfrak{R}_1\times\mathfrak{R}_3)$ is its error. This is of the fourth order in general (since the denominator is of the first), and for equal intervals, of the fifth. The same is true of the two other ratios. Thus we have
$$\frac{\mathfrak{R}_1\times\mathfrak{R}_2}{\mathfrak{R}_1\times\mathfrak{R}_3} = \frac{p_3}{p_2} - \frac{\mathfrak{R}_1\times\mathfrak{Z}}{p_2\,\mathfrak{R}_1\times\mathfrak{R}_3}.$$
Adding these equations and subtracting 1 [from both sides] we have
$$\frac{\mathfrak{R}_2\times\mathfrak{R}_3 + \mathfrak{R}_1\times\mathfrak{R}_2 - \mathfrak{R}_1\times\mathfrak{R}_3}{\mathfrak{R}_1\times\mathfrak{R}_3} = \frac{p_1 - p_2 + p_3}{p_2} - \frac{\mathfrak{Z}\times\mathfrak{R}_3 + \mathfrak{R}_1\times\mathfrak{Z}}{p_2\,\mathfrak{R}_1\times\mathfrak{R}_3}.$$
Here the last term, which represents the error, is of the fifth order in general, or for equal intervals, of the sixth. But the
quantity sought is of the second order, and the relative error is of the third order in the general case, or the fourth for equal intervals. It is precisely this error which is most important in the case of elliptic orbits. It will be observed that the accuracy of the expressions for the ratios $[r_1r_2] : [r_1r_3] : [r_2r_3]$ affords no measure of the accuracy of the formula for the determination of elliptic orbits. I think that this hasty sketch will illustrate the convenience and perspicuity of vector notations in this subject, quite independently of any particular method which is chosen for the determination of the orbit. What is the best method? is hardly, I think, a question which admits of a definite reply. It certainly depends upon the ratio of the time intervals, their absolute value, and many other things.

Yours very truly,
J. Willard Gibbs.

P.S. If we wish to use the curtate distances, with reference to the ecliptic or the equator, let $\rho_1$ be defined as the distance multiplied by cosine (lat. or dec.), and $\mathfrak{F}_1$ as a vector of length secant (lat. or dec.). For the most part the formulae will require no change, but the square of $\mathfrak{F}_1$ will be sec²(lat. or dec.) instead of unity, so that the last terms of (8) will have this factor. $(\mathfrak{F}_1\mathfrak{F}_2\mathfrak{F}_3)$ will then be Gauss' (0.1.2.), whereas in my paper $(\mathfrak{F}_1\mathfrak{F}_2\mathfrak{F}_3)$ is Lagrange's (C′ C″ C‴).

J. W. G.

VII. ON THE RÔLE OF QUATERNIONS IN THE ALGEBRA OF VECTORS.
[Nature, vol. xliii. pp. 511-513, April 2, 1891.]

The following passage, which has recently come to my notice, in the preface to the third edition of Prof. Tait’s Quaternions seems to call for some reply: “Even Prof.
Willard Gibbs must be ranked as one of the retarders of quaternion progress, in virtue of his pamphlet on Vector Analysis, a sort of hermaphrodite monster, compounded of the notations of Hamilton and of Grassmann.” The merits or demerits of a pamphlet printed for private distribution a good many years ago do not constitute a subject of any great importance, but the assumptions implied in the sentence quoted are suggestive of certain reflections and inquiries which are of broader interest, and seem not untimely at a period when the methods and results of the various forms of multiple algebra are attracting so much attention. It seems to be assumed that a departure from quaternionic usage in the treatment of vectors is an enormity. If this assumption is true, it is an important truth; if not, it would be unfortunate if it should remain unchallenged, especially when supported by so high an authority. The criticism relates particularly to notations, but I believe that there is a deeper question of notions underlying that of notations. Indeed, if my offence had been solely in the matter of notation, it would have been less accurate to describe my production as a monstrosity, than to characterize its dress as uncouth. Now what are the fundamental notions which are germane to a vector analysis? (A vector analysis is of course an algebra for vectors, or something which shall be to vectors what ordinary algebra is to ordinary quantities.) If we pass over those notions which are so simple that they go without saying, geometrical addition (denoted by +) is, perhaps, first to be mentioned. Then comes the product of the lengths of two vectors and the cosine of the angle which they include. This, taken negatively, is denoted in quaternions by Sαβ, where α and β are the vectors. Equally important is a vector at right angles to α and β (on a specified side of their plane), and
representing in length the product of their lengths and the sine of the angle which they include. This is denoted by Vαβ in quaternions. How these notions are represented in my pamphlet is a question of very subordinate consequence, which need not be considered at present. The importance of these notions, and the importance of a suitable notation for them, is not, I suppose, a matter on which there is any difference of opinion. Another function of α and β, called their product and written αβ, is used in quaternions. In the general case, this is neither a vector, like Vαβ, nor a scalar (or ordinary algebraic quantity), like Sαβ, but a quaternion; that is, it is part vector and part scalar. It may be defined by the equation
αβ = Vαβ + Sαβ.
The question arises, whether the quaternionic product can claim a prominent and fundamental place in a system of vector analysis. It certainly does not hold any such place among the fundamental geometrical conceptions as the geometrical sum, the scalar product, or the vector product. The geometrical sum α + β represents the third side of a triangle as determined by the sides α and β. Vαβ represents in magnitude the area of the parallelogram determined by the sides α and β, and in direction the normal to the plane of the parallelogram. SγVαβ represents the volume of the parallelopiped determined by the edges α, β, and γ. These conceptions are the very foundations of geometry. We may arrive at the same conclusion from a somewhat narrower but very practical point of view. It will hardly be denied that sines and cosines play the leading parts in trigonometry. Now the notations Vαβ and Sαβ represent the sine and the cosine of the angle included between α and β, combined in each case with certain other simple notions.
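The defining relation of the quaternionic product of two vectors, as the sum of its vector and scalar parts, can be checked with a few lines of Python. This is an illustrative sketch, not part of the original argument: vectors are stored as 3-tuples, quaternions as (scalar, vector) pairs, a right-handed convention is assumed for the vector product, and the helper names are hypothetical.

```python
def dot(a, b):
    """Product of the lengths of a and b and the cosine of the included angle."""
    return sum(x * y for x, y in zip(a, b))


def cross(a, b):
    """Vector at right angles to a and b (right-handed convention assumed)."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def quat_mul(p, q):
    """Hamilton product of quaternions stored as (scalar, (x, y, z))."""
    ps, pv = p
    qs, qv = q
    c = cross(pv, qv)
    scalar = ps * qs - dot(pv, qv)
    vector = tuple(ps * qv[i] + qs * pv[i] + c[i] for i in range(3))
    return scalar, vector


def S(q):
    """Selective symbol S: the scalar part of a quaternion."""
    return q[0]


def V(q):
    """Selective symbol V: the vector part of a quaternion."""
    return q[1]


alpha, beta, gamma = (1, 2, 3), (-2, 0, 5), (1, 1, 1)
ab = quat_mul((0, alpha), (0, beta))

# The scalar part is the cosine product taken negatively;
# the vector part is the vector at right angles to both factors.
assert S(ab) == -dot(alpha, beta)
assert V(ab) == cross(alpha, beta)

# S gamma V alpha beta: in magnitude the volume of the parallelopiped.
triple = S(quat_mul((0, gamma), (0, V(ab))))
assert abs(triple) == abs(dot(gamma, cross(alpha, beta)))
```

The sketch makes visible that the quaternion product is only a container for the two functions: everything used afterwards is recovered by applying `S` or `V`.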
But the sine and cosine combined with these auxiliary notions are incomparably more amenable to analytical transformation than the simple sine and cosine of trigonometry, exactly as numerical quantities combined (as in algebra) with the notion of positive or negative quality are incomparably more amenable to analytical transformation than the simple numerical quantities of arithmetic. I do not know of anything which can be urged in favor of the quaternionic product of two vectors as a fundamental notion in vector analysis, which does not appear trivial or artificial in comparison with the above considerations. The same is true of the quaternionic quotient, and of the quaternion in general. How much more deeply rooted in the nature of things are the functions Sαβ and Vαβ than any which depend on the definition of a quaternion, will appear in a strong light if we try to extend our formulae to space of four or more dimensions. It will not be claimed that the notions of quaternions will apply to such a space, except indeed in such a limited and artificial manner as to rob them of their value as a system of geometrical algebra. But vectors exist in such a space, and there must be a vector analysis for such a space. The notions of geometrical addition and the scalar product are evidently applicable to such a space. As we cannot define the direction of a vector in space of four or more dimensions by the condition of perpendicularity to two given vectors, the definition of Vαβ, as given above, will not apply totidem verbis to space of four or more dimensions. But a little change in the definition, which would make no essential difference in three dimensions, would enable us to apply the idea at once to space of any number of dimensions. These considerations are of a somewhat a priori nature. It may be more convincing to consider the use actually made of the quaternion as an instrument for the expression of spatial relations.
The principal use seems to be the derivation of the functions expressed by Sαβ and Vαβ. Each of these expressions is regarded by quaternionic writers as representing two distinct operations; first, the formation of the product αβ, which is the quaternion, and then the taking out of this quaternion the scalar or the vector part, as the case may be, this second process being represented by the selective symbol, S or V. This is, I suppose, the natural development of the subject in a treatise on quaternions, where the chosen subject seems to require that we should commence with the idea of a quaternion, or get there as soon as possible, and then develop everything from that particular point of view. In a system of vector analysis, in which the principle of development is not thus predetermined, it seems to me contrary to good method that the more simple and elementary notions should be defined by means of those which are less so. The quaternion affords a convenient notation for rotations. The notation q( )q⁻¹, where q is a quaternion and the operand is to be written in the parenthesis, produces on all possible vectors just such changes as a (finite) rotation of a solid body. Rotations may also be represented, in a manner which seems to leave nothing to be desired, by linear vector functions. Doubtless each method has advantages in certain cases, or for certain purposes. But since nothing is more simple than the definition of a linear vector function, while the definition of a quaternion is far from simple, and since in any case linear vector functions must be treated in a system of vector analysis, capacity for representing rotations does not seem to me sufficient to entitle the quaternion to a place among the fundamental and necessary notions of a vector analysis. Another use of the quaternionic idea is associated with the symbol ∇.
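The two representations of a rotation mentioned above, the quaternionic form q( )q⁻¹ and a linear vector function, can be compared numerically. The Python sketch below is an illustration under assumptions: the linear vector function is written as a 3×3 matrix built from the axis and angle (the formula commonly attributed to Rodrigues), which is a modern choice and not taken from the text, and all function names are hypothetical.

```python
import math


def quat_mul(p, q):
    """Hamilton product of quaternions stored as (w, x, y, z)."""
    w1, x1, y1, z1 = p
    w2, x2, y2, z2 = q
    return (w1 * w2 - x1 * x2 - y1 * y2 - z1 * z2,
            w1 * x2 + x1 * w2 + y1 * z2 - z1 * y2,
            w1 * y2 - x1 * z2 + y1 * w2 + z1 * x2,
            w1 * z2 + x1 * y2 - y1 * x2 + z1 * w2)


def unit(axis):
    """Axis scaled to unit length."""
    n = math.sqrt(sum(c * c for c in axis))
    return tuple(c / n for c in axis)


def rotate_with_quaternion(axis, angle, v):
    """Apply q( )q^-1 to the vector v, with q built from axis and angle."""
    ux, uy, uz = unit(axis)
    h = angle / 2.0
    q = (math.cos(h), ux * math.sin(h), uy * math.sin(h), uz * math.sin(h))
    q_inv = (q[0], -q[1], -q[2], -q[3])  # inverse of a unit quaternion
    return quat_mul(quat_mul(q, (0.0,) + tuple(v)), q_inv)[1:]


def rotation_matrix(axis, angle):
    """The same rotation written as a linear vector function (3x3 matrix)."""
    ux, uy, uz = unit(axis)
    c, s = math.cos(angle), math.sin(angle)
    k = 1.0 - c
    return ((c + k * ux * ux, k * ux * uy - s * uz, k * ux * uz + s * uy),
            (k * uy * ux + s * uz, c + k * uy * uy, k * uy * uz - s * ux),
            (k * uz * ux - s * uy, k * uz * uy + s * ux, c + k * uz * uz))


def apply(matrix, v):
    """Apply a linear vector function, given as a matrix, to the vector v."""
    return tuple(sum(row[j] * v[j] for j in range(3)) for row in matrix)


if __name__ == "__main__":
    axis, angle, v = (1.0, 2.0, -1.0), 0.7, (0.3, -1.2, 2.5)
    print(rotate_with_quaternion(axis, angle, v))
    print(apply(rotation_matrix(axis, angle), v))
```

Both routes give the same rotated vector; the matrix form needs only the notion of a linear function of a vector, which is the point of the comparison.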
The quantities written S∇ω and V∇ω, where ω denotes a vector having values which vary in space, are of fundamental importance in physics. In quaternions these are derived from the quaternion ∇ω by selecting respectively the scalar or the vector part. But the most simple and elementary definitions of S∇ω and V∇ω are quite independent of the conception of a quaternion, and the quaternion ∇ω is scarcely used except in combination with the symbols S and V, expressed or implied. There are a few formulae in which there is a trifling gain in compactness in the use of the quaternion, but the gain is very trifling so far as I have observed, and generally, it seems to me, at the expense of perspicuity. These considerations are sufficient, I think, to show that the position of the quaternionist is not the only one from which the subject of vector analysis may be viewed, and that a method which would be monstrous from one point of view, may be normal and inevitable from another. Let us now pass to the subject of notations. I do not know wherein the notations of my pamphlet have any special resemblance to Grassmann’s, although the point of view from which the pamphlet was written is certainly much nearer to his than to Hamilton’s. But this is a matter of minor consequence. It is more important to ask, What are the requisites of a good notation for the purposes of vector analysis? There is no difference of opinion about the representation of geometrical addition. When we come to functions having an analogy to multiplication, the product of the lengths of two vectors and the cosine of the angle which they include, from any point of view except that of the quaternionist, seems more simple than the same quantity taken negatively. Therefore we want a notation for what is expressed by −Sαβ, rather than Sαβ, in quaternions. Shall the symbol denoting this function be a letter or some other sign? and shall it precede the vectors or be placed between them?
A little reflection will show, I think, that while we must often have recourse to letters to supplement the number of signs available for the expression of all kinds of operations, it is better that the symbols expressing the most fundamental and frequently recurring operations should not be letters, and that a sign between the vectors, and, as it were, uniting them, is better than a sign before them in a case having a formal analogy with multiplication. The case may be compared with that of addition, for which α + β is evidently more convenient than Σ(α, β) or Σαβ would be. Similar considerations will apply to the function written in quaternions Vαβ. It would seem that we obtain the ne plus ultra of simplicity and convenience, if we express the two functions by uniting the vectors in each case with a sign suggestive of multiplication. The particular form of the signs which we adopt is a matter of minor consequence. In order to keep within the resources of an ordinary printing office, I have used a dot and a cross, which are already associated with multiplication, but are not needed for ordinary multiplication, which is best denoted by the simple juxtaposition of the factors. I have no especial predilection for these particular signs. The use of the dot is indeed liable to the objection that it interferes with its use as a separatrix, or instead of a parenthesis. If, then, I have written α.β and α×β for what is expressed in quaternions by −Sαβ and Vαβ, and in like manner ∇.ω and ∇×ω for −S∇ω and V∇ω in quaternions, it is because the natural development of a vector analysis seemed to lead logically to some such notations. But I think that I can show that these notations have some substantial advantages over the quaternionic in point of convenience. Any linear vector function of a variable vector ρ may be expressed in the form
αλ.ρ + βμ.ρ + γν.ρ = (αλ + βμ + γν).ρ = Φ.ρ,
where Φ = αλ + βμ + γν; or in quaternions
−αSλρ − βSμρ − γSνρ = −(αSλ + βSμ + γSν)ρ = −φρ,
where φ = αSλ + βSμ + γSν. If we take the scalar product of the vector Φ.ρ, and another vector σ, we obtain σ.Φ.ρ = σ.(αλ + βμ + γν).ρ, or in quaternions Sσφρ = Sσ(αSλ + βSμ + γSν)ρ. This is a function of σ and of ρ, and it is exactly the same kind of function of σ that it is of ρ, a symmetry which is not so clearly exhibited in the quaternionic notation as in the other. Moreover, we can write σ.Φ for σ.(αλ + βμ + γν). This represents a vector which is a function of σ, viz., the function conjugate to Φ.σ; and σ.Φ.ρ may be regarded as the product of this vector and ρ. This is not so clearly indicated in the quaternionic notation, where it would be straining things a little to call Sσφ […]

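The linear vector function Φ = αλ + βμ + γν and the double products used in this passage can be sketched in Python. The sketch rests on assumptions that should be stated plainly: a dyadic is stored as a list of (α, λ) pairs; the double dot product is taken as (αλ):(βμ) = (α.β)(λ.μ); and the double cross product is assumed to be (αλ)××(βμ) = (α×β)(λ×μ), a definition that is not legible in this copy. All function names are hypothetical.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def apply_right(dyadic, rho):
    """Phi.rho: the sum of alpha (lambda.rho) over the dyads."""
    return tuple(sum(a[i] * dot(l, rho) for a, l in dyadic) for i in range(3))


def apply_left(sigma, dyadic):
    """sigma.Phi: the sum of (sigma.alpha) lambda, the conjugate function."""
    return tuple(sum(dot(sigma, a) * l[i] for a, l in dyadic) for i in range(3))


def double_cross(d1, d2):
    """Assumed definition: (alpha lambda) xx (beta mu) = (alpha x beta)(lambda x mu)."""
    return [(cross(a, b), cross(l, m)) for a, l in d1 for b, m in d2]


def double_dot(d1, d2):
    """(alpha lambda):(beta mu) = (alpha.beta)(lambda.mu), summed over dyads."""
    return sum(dot(a, b) * dot(l, m) for a, l in d1 for b, m in d2)


def to_matrix(dyadic):
    """Nine coefficients of the dyadic arranged in a square."""
    return [[sum(a[i] * l[j] for a, l in dyadic) for j in range(3)]
            for i in range(3)]


def cofactor_matrix(m):
    """Cofactors of m: the determinant times the conjugate of the reciprocal."""
    return [[m[(i + 1) % 3][(j + 1) % 3] * m[(i + 2) % 3][(j + 2) % 3]
             - m[(i + 1) % 3][(j + 2) % 3] * m[(i + 2) % 3][(j + 1) % 3]
             for j in range(3)] for i in range(3)]


def determinant(m):
    c = cofactor_matrix(m)
    return sum(m[0][j] * c[0][j] for j in range(3))


# Illustrative integer dyadic, so that every check below is exact.
phi = [((1, 0, 2), (0, 1, 1)), ((0, 3, 1), (2, 0, 1)), ((1, 1, 0), (1, -1, 0))]
sigma, rho = (2, -1, 3), (1, 4, -2)
m = to_matrix(phi)

# sigma.(Phi.rho) and (sigma.Phi).rho are the same scalar.
assert dot(sigma, apply_right(phi, rho)) == dot(apply_left(sigma, phi), rho)
# One sixth of Phi xx Phi : Phi is the determinant.
assert double_dot(double_cross(phi, phi), phi) == 6 * determinant(m)
# Phi xx Phi is twice the determinant times the conjugate of the reciprocal.
assert to_matrix(double_cross(phi, phi)) == [[2 * c for c in row]
                                             for row in cofactor_matrix(m)]
```

Under these assumed definitions the checks pass exactly in integer arithmetic, which agrees with the statements about the determinant and the reciprocal made in the text.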
(αλ):(βμ) = (α.β)(λ.μ). With these definitions ⅙Φ××Φ:Φ will be the determinant of Φ, and Φ××Φ will be the conjugate of the reciprocal of Φ multiplied by twice the determinant. If Φ represents the manner in which vectors are affected by a strain, ½Φ××Φ will represent the manner in which surfaces are affected, and ⅙Φ××Φ:Φ the manner in which volumes are affected. Considerations of this kind do not attach themselves so naturally to the notation φ = αSλ + βSμ + γSν, nor does the subject admit so free a development with this notation, principally because the symbol S refers to a special use of the matrix, and is very much in the way when we want to apply the matrix to other uses, or to subject it to various operations.

VIII. QUATERNIONS AND THE AUSDEHNUNGSLEHRE.
[Nature, vol. xliv. pp. 79-82, May 28, 1891.]

The year 1844 is memorable in the annals of mathematics on account of the first appearance on the printed page of Hamilton’s Quaternions and Grassmann’s Ausdehnungslehre. The former appeared in the July, October, and supplementary numbers of the Philosophical Magazine, after a previous communication to the Royal Irish Academy, November 13, 1843. This communication was indeed announced to the Council of the Academy four weeks earlier, on the very day of Hamilton’s discovery of quaternions, as we learn from one of his letters. The author of the Ausdehnungslehre, although not unconscious of the value of his ideas, seems to have been in no haste to place himself on record, and published nothing until he was able to give the world the most characteristic and fundamental part of his system with considerable development in a treatise of more than 300 pages, which appeared in August 1844.
The doctrine of quaternions has won a conspicuous place among the various branches of mathematics, but the nature and scope of the Ausdehnungslehre, and its relation to quaternions, seem to be still the subject of serious misapprehension in quarters where we naturally look for accurate information. Historical justice, and the interests of mathematical science, seem to require that the allusions to the Ausdehnungslehre in the article on “Quaternions” in the last edition of the Encyclopaedia Britannica, and in the third edition of Prof. Tait’s Treatise on Quaternions, should not be allowed to pass without protest. It is principally as systems of geometrical algebra that quaternions and the Ausdehnungslehre come into comparison. To appreciate the relations of the two systems, I do not see how we can proceed better than if we ask first what they have in common, then what either system possesses which is peculiar to itself. The relative extent and importance of the three fields, that which is common to the two systems, and those which are peculiar to each, will determine the relative rank of the geometrical algebras. Questions of priority can only relate to the field common to both, and will be much simplified by having the limits of that field clearly drawn. Geometrical addition in three dimensions is common to the two systems, and seems to have been discovered independently both by Hamilton and Grassmann, as well as by several other persons about the same time. It is not probable that any especial claim for priority with respect to this principle will be urged for either of the two with which we are now concerned. The functions of two vectors which are represented in quaternions by Sαβ and Vαβ are common to both systems as published in 1844, but the quaternion is peculiar to Hamilton’s. The linear vector function is common to both systems as ultimately developed, although mentioned only by Grassmann as early as 1844.
To those already acquainted with quaternions, the first question will naturally be: To what extent are the geometrical methods which are usually called quaternionic peculiar to Hamilton, and to what extent are they common to Grassmann? This is a question which anyone can easily decide for himself. It is only necessary to run one’s eye over the equations used by quaternionic writers in the discussion of geometrical or physical subjects, and see how far they necessarily involve the idea of the quaternion, and how far they would be intelligible to one understanding the functions Sαβ and Vαβ, but having no conception of the quaternion αβ, or at least could be made so by trifling changes of notation, as by writing S or V in places where they would not affect the value of the expressions. For such a test the examples and illustrations in treatises on quaternions would be manifestly inappropriate, so far as they are chosen to illustrate quaternionic principles, since the object may influence the form of presentation. But we may use any discussion of geometrical or physical subjects, where the writer is free to choose the form most suitable to the subject. I myself have used the chapters and sections in Prof. Tait’s Quaternions on the following subjects: Geometry of the straight line and plane, the sphere and cyclic cone, surfaces of the second degree, geometry of curves and surfaces, kinematics, statics and kinetics of a rigid system, special kinetic problems, geometrical and physical optics, electrodynamics, general expressions for the action between linear elements, application of ∇ to certain physical analogies, pp. 160-371, except the examples (not worked out) at the close of the chapters. Such an examination will show that for the most part the methods of representing spatial relations used by quaternionic writers are common to the systems of Hamilton and Grassmann.
To an extent comparatively limited, cases will be found in which the quaternionic idea forms an essential element in the signification of the equations. The question will then arise with respect to the comparatively limited field which is the peculiar property of Hamilton, How important are the advantages to be gained by the use of the quaternion? This question, unlike the preceding, is one into which a personal equation will necessarily enter. Everyone will naturally prefer the methods with which he is most familiar; but I think that it may be safely affirmed that in the majority of cases in this field the advantage derived from the use of the quaternion is either doubtful or very trifling. There remains a residuum of cases in which a substantial advantage is gained by the use of the quaternionic method. Such cases, however, so far as my own observation and experience extend, are very exceptional. If a more extended and careful inquiry should show that they are ten times as numerous as I have found them, they would still be exceptional. We have now to inquire what we find in the Ausdehnungslehre in the way of a geometrical algebra, that is wanting in quaternions. In addition to an algebra of vectors, the Ausdehnungslehre affords a system of geometrical algebra in which the point is the fundamental element, and which for convenience I shall call Grassmann’s algebra of points. In this algebra we have first the addition of points, or quantities located at points, which may be explained as follows. The equation
aA + bB + cC + etc. = eE + fF + etc.,
in which the capitals denote points, and the small letters scalars (or ordinary algebraic quantities), signifies that
a + b + c + etc. = e + f + etc.,
and also that the centre of gravity of the weights a, b, c, etc., at the points A, B, C, etc., is the same as that of the weights e, f, etc., at the points E, F, etc.
(It will be understood that negative weights are allowed as well as positive.) The equation is thus equivalent to four equations of ordinary algebra. In this Grassmann was anticipated by Möbius (Barycentrischer Calcul, 1827). We have next the addition of finite straight lines, or quantities located in straight lines (Liniengrössen). The meaning of the equation
AB + CD + etc. = EF + GH + etc.
will perhaps be understood most readily, if we suppose that each member represents a system of forces acting on a rigid body. The equation then signifies that the two systems are equivalent. An equation of this form is therefore equivalent to six ordinary equations. It will be observed that the Liniengrössen AB and CD are not simply vectors; they have not merely length and direction, but they are also located each in a given line, although their position within those lines is immaterial. In Clifford's terminology, AB is a rotor, AB + CD a motor. In the language of Prof. Ball's Theory of Screws, AB + CD represents either a twist or a wrench. We have next the addition of plane surfaces (Plangrössen). The equation
ABC + DEF + GHI = JKL
signifies that the plane JKL passes through the point common to the planes ABC, DEF, and GHI, and that the projection by parallel lines of the triangle JKL on any plane is equal to the sum of the projections of ABC, DEF, and GHI on the same plane, the areas being taken positively or negatively according to the cyclic order of the projected points. This makes the equation equivalent to four ordinary equations. Finally, we have the addition of volumes, as in the equation
ABCD + EFGH = IJKL,
where there is nothing peculiar, except that each term represents the six-fold volume of the tetrahedron, and is to be taken positively or negatively according to the relative position of the points. We have also multiplications as follows: The line (Liniengrösse) AB is regarded as the product of the points A and B.
The Plangrösse ABC, which represents the double area of the triangle, is regarded as the product of the three points A, B, and C, or as the product of the line AB and the point C, or of BC and A, or indeed of BA and C. The volume ABCD, which represents six times the tetrahedron, is regarded as the product of the points A, B, C, and D, or as the product of the point A and the Plangrösse BCD, or as the product of the lines AB and CD, etc., etc.

This does not exhaust the wealth of multiplicative relations which Grassmann has found in the very elements of geometry. The following products are called regressive, as distinguished from the progressive, which have been described. The product of the Plangrössen ABC and DEF is a part of the line in which the planes ABC and DEF intersect, which is equal in numerical value to the product of the double areas of the triangles ABC and DEF multiplied by the sine of the angle made by the planes. The product of the Liniengrösse AB and the Plangrösse CDE is the point of intersection of the line and the plane with a numerical coefficient representing the product of the length of the line and the double area of the triangle multiplied by the sine of the angle made by the line and the plane. The product of three Plangrössen is consequently the point common to the three planes with a certain numerical coefficient. In plane geometry we have a regressive product of two Liniengrössen, which gives the point of intersection of the lines with a certain numerical coefficient.

The fundamental operations relating to the point, line, and plane are thus translated into analysis by multiplications. The immense flexibility and power of such an analysis will be appreciated by anyone who considers what generalized multiplication in connection with additive relations has done in other fields, as in quaternions, or in the theory of matrices, or in the algebra of logic.
For a single example, if we multiply the equation AB + CD + etc. = EF + GH + etc. by PQ (P and Q being any two points), we have ABPQ + CDPQ + etc. = EFPQ + GHPQ + etc., which will be recognised as expressing an important theorem of statics.

The field in which Grassmann’s algebra of points, as distinguished from his algebra of vectors, finds its especial application and utility is nearly coincident with that in which, when we use the methods of ordinary algebra, tetrahedral or anharmonic coordinates are more appropriate than rectilinear. In fact, Grassmann’s algebra of points may be regarded as the application of the methods of multiple algebra to the notions connected with tetrahedral coordinates, just as his or Hamilton’s algebra of vectors may be regarded as the application of the methods of multiple algebra to the notions connected with rectilinear coordinates. These methods, however, enrich the field to which they are applied with new notions. Thus the notion of the coordinates of a line in space, subsequently introduced by Plücker, was first given in the Ausdehnungslehre of 1844. It should also be observed that the utility of a multiple algebra when it takes the place of an ordinary algebra of four coordinates, is very much greater than when it takes the place of three coordinates, for the same reason that a multiple algebra taking the place of three coordinates is very much more useful than one taking the place of two. Grassmann’s algebra of points will always command the admiration of geometers and analysts, and furnishes an instrument of marvellous power to the former, and in its general form, as applicable to space of any number of dimensions, to the latter. To the physicist an algebra of points is by no means so indispensable an instrument as an algebra of vectors.
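The progressive products of points and the statical theorem just mentioned may be checked numerically. What follows is only a minimal sketch and is not Grassmann's own notation: it assumes that each point of unit weight is written as a row (1, x, y, z), so that the product of four points becomes a 4×4 determinant, namely the six-fold signed volume of the tetrahedron. The helper names `det` and `point_product` and the sample coordinates are hypothetical, chosen only for illustration.

```python
def det(m):
    """Determinant by Laplace expansion along the first row (adequate for 4x4)."""
    if len(m) == 1:
        return m[0][0]
    total = 0
    for j, entry in enumerate(m[0]):
        # Minor: drop the first row and column j.
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * entry * det(minor)
    return total


def point_product(a, b, c, d):
    """Product of four unit-weight points: six times the signed tetrahedron volume.

    Assumption for illustration: a point (x, y, z) is represented by the
    homogeneous row [1, x, y, z].
    """
    return det([[1, *p] for p in (a, b, c, d)])


# The unit tetrahedron has volume 1/6, so its six-fold volume is 1;
# exchanging two points reverses the sign.
O, X, Y, Z = (0, 0, 0), (1, 0, 0), (0, 1, 0), (0, 0, 1)
unit_volume = point_product(O, X, Y, Z)
swapped_volume = point_product(X, O, Y, Z)

# Two forces AB and AC acting at A, and their resultant AD (parallelogram rule),
# form equivalent systems: AB + AC = AD.
A, B, C = (0, 0, 0), (1, 0, 0), (0, 1, 0)
D = (1, 1, 0)
P, Q = (2, 0, 1), (0, 3, 5)  # any two points, defining an axis PQ

# Multiplying the equation by PQ: ABPQ + ACPQ should equal ADPQ.
lhs = point_product(A, B, P, Q) + point_product(A, C, P, Q)
rhs = point_product(A, D, P, Q)
print(unit_volume, swapped_volume, lhs, rhs)
```

Because a determinant is linear in each of its rows, the equality holds for every choice of P and Q. Each term is proportional to the moment of the corresponding force about the axis PQ, so that equivalent systems of forces have equal moments about every axis.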
Grassmann’s algebra of vectors, which we have described as coincident with a part of Hamilton’s system, is not really anything separate from his algebra of points, but constitutes a part of it, the vector arising when one point is subtracted from another. Yet it constitutes a whole, complete in itself, and we may separate it from the larger system to facilitate comparison with the methods of Hamilton.

We have, then, as geometrical algebras published in 1844, an algebra of vectors common to Hamilton and Grassmann, augmented on Hamilton’s side by the quaternion, and on Grassmann’s by his algebra of points. This statement should be made with the reservation that the addition both of vectors and of points had been given by earlier writers.

In both systems as finally developed we have the linear vector function, the theory of which is identical with that of strains and rotations. In Hamilton’s system we have also the linear quaternion function, and in Grassmann’s the linear function applied to the quantities of his algebra of points. This application gives those transformations in which projective properties are preserved, the doctrine of reciprocal figures or principle of duality, etc. (Grassmann’s theory of the linear function is, indeed, broader than this, being coextensive with the theory of matrices; but we are here considering only the geometrical side of the theory.)

In his earliest writings on quaternions, Hamilton does not discuss the linear function. In his Lectures on Quaternions (1853), he treats of the inversion of the linear vector function, as also of the linear quaternion function, and shows how to find the latent roots of the vector function, with the corresponding axes for the case of real and unequal roots. He also gives a remarkable equation, the symbolic cubic, which the functional symbol must satisfy. This equation is a particular case of that which is given in Prof.
Cayley’s classical Memoir on the Theory of Matrices (1858), and which is called by Prof. Sylvester the Hamilton-Cayley equation. In his Elements of Quaternions (1866), Hamilton extends the symbolic equation to the quaternion function.

In Grassmann, although the linear function is mentioned in the first Ausdehnungslehre, we do not find so full a discussion of the subject until the second Ausdehnungslehre (1862), where he discusses the latent roots and axes, or what corresponds to axes in the general theory, the whole discussion relating to matrices of any order. The more difficult cases are included, as that of a strain in which all the roots are real, but there is only one axis or unchanged direction. On the formal side he shows how a linear function may be represented by a quotient or sum of quotients, and by a sum of products, Lückenausdruck.

More important, perhaps, than the question when this or that theorem was first published is the question where we first find those notions and notations which give the key to the algebra of linear functions, or the algebra of matrices, as it is now generally called. In vol. xxxi, p. 35, of Nature, Prof. Sylvester speaks of Cayley’s “ever-memorable” Memoir on Matrices as constituting “a second birth of Algebra, its avatar in a new and glorified form,” and refers to a passage in his Lectures on Universal Algebra, from which, I think, we are justified in inferring that this characterization of the memoir is largely due to the fact that it is there shown how matrices may be treated as extensive quantities, capable of addition as well as of multiplication. This idea, however, is older than the memoir of 1858. The Lückenausdruck, by which the matrix is expressed as a sum of a kind of products (lückenhaltig, or open), is described in a note at the end of the first Ausdehnungslehre.
There we have the matrix given not only as a sum, but as a sum of products, introducing a multiplicative relation entirely different from the ordinary multiplication of matrices, and hardly less fruitful, but not lying nearly so near the surface as the relations to which Prof. Sylvester refers. The key to the theory of matrices is certainly given in the first Ausdehnungslehre, and if we call the birth of matricular analysis the second birth of algebra, we can give no later date to this event than the memorable year of 1844.

The immediate occasion of this communication is the following passage in the preface to the third edition of Prof. Tait’s Quaternions: “Hamilton not only published his theory complete, the year before the first (and extremely imperfect) sketch of the Ausdehnungslehre appeared; but had given ten years before, in his protracted study of Sets, the very processes of external and internal multiplication (corresponding to the Vector and Scalar parts of a product of two vectors) which have been put forward as specially the property of Grassmann.” For additional information we are referred to art. “Quaternions,” Encyc. Brit., where we read respecting the first Ausdehnungslehre: “In particular two species of multiplication (‘inner’ and ‘outer’) of directed lines in one plane were given. The results of these two kinds of multiplication correspond respectively to the numerical and the directed parts of Hamilton’s quaternion product. But Grassmann distinctly states in his preface that he had not had leisure to extend his method to angles in space. . . .
But his claims, however great they may be, can in no way conflict with those of Hamilton, whose mode of multiplying couples (in which the ‘inner’ and ‘outer’ multiplication are essentially involved) was produced in 1833, and whose quaternion system was completed and published before Grassmann had elaborated for press even the rudimentary portions of his own system, in which the veritable difficulty of the whole subject, the application to angles in space, had not even been attacked.”

I shall leave the reader to judge of the accuracy of the general terms used in these passages in comparing the first Ausdehnungslehre with Hamilton’s system as published in 1843 or 1844. The specific statements respecting Hamilton and Grassmann require an answer.

It must be Hamilton’s Theory of Conjugate Functions or Algebraic Couples (read to the Royal Irish Academy, 1833 and 1835, and published in vol. xvii of the Transactions) to which reference is made in the statements concerning his “protracted study of Sets” and “mode of multiplying couples.” But I cannot find anything like Grassmann’s external or internal multiplication in this memoir, which is concerned, as the title pretty clearly indicates, with the theory of the complex quantities of ordinary algebra.

It is difficult to understand the statements respecting the Ausdehnungslehre, which seem to imply that Grassmann’s two kinds of multiplication were subject to some kind of limitation to a plane. The external product is not limited in the first Ausdehnungslehre even to three dimensions. The internal, which is a comparatively simple matter, is mentioned in the first Ausdehnungslehre only in the preface, where it is defined, and placed beside the external product as relating to directed lines. There is not the least suggestion of any difference in the products in respect to the generality of their application to vectors.
The misunderstanding seems to have arisen from the following sentence in Grassmann’s preface: “And in general, in the consideration of angles in space, difficulties present themselves, for the complete (allseitig) solution of which I have not yet had sufficient leisure.” It is not surprising that Grassmann should have required more time for the development of some parts of his system, when we consider that Hamilton, on his discovery of quaternions, estimated the time which he should wish to devote to them at ten or fifteen years (see his letter to Prof. Tait in the North British Review for September 1866), and actually took several years to prepare for the press as many pages as Grassmann had printed in 1844. But any speculation as to the questions which Grassmann may have had principally in mind in the sentence quoted, and the particular nature of the difficulties which he found in them, however interesting from other points of view, seems a very precarious foundation for a comparison of the systems of Hamilton and Grassmann as published in the years 1843-44. Such a comparison should be based on the positive evidence of doctrines and methods actually published.

Such a comparison I have endeavoured to make, or rather to indicate the basis on which it may be made, so far as systems of geometrical algebra are concerned. As a contribution to analysis in general, I suppose that there is no question that Grassmann’s system is of indefinitely greater extension, having no limitation to any particular number of dimensions.

IX. QUATERNIONS AND THE ALGEBRA OF VECTORS.

[Nature, vol. xlvii. pp. 463, 464, Mar. 16, 1893.]

In a recent number of Nature [vol. xlvii, p. 151], Mr. McAulay puts certain questions to Mr. Heaviside and to me, relating to a subject of such importance as to justify an answer somewhat at length. I cannot of course speak for Mr.
Heaviside, although I suppose that his views are not very different from mine on the most essential points, but even if he shall have already replied before this letter can appear, I shall be glad to add whatever of force may belong to independent testimony.

Mr. McAulay asks: “What is the first duty of the physical vector analyst qua physical vector analyst?” The answer is not doubtful. It is to present the subject in such a form as to be most easily acquired, and most useful when acquired.

In regard to the slow progress of such methods towards recognition and use by physicists and others, which Mr. McAulay deplores, it does not seem possible to impute it to any want of uniformity of notation. I doubt whether there is any modern branch of mathematics which has been presented for so long a time with a greater uniformity of notation than quaternions.

What, then, is the cause of the fact which Mr. McAulay and all of us deplore? It is not far to seek. We need only a glance at the volumes in which Hamilton set forth his method. No wonder that physicists and others failed to perceive the possibilities of simplicity, perspicuity, and brevity which were contained in a system presented to them in ponderous volumes of 800 pages. Perhaps Hamilton may have intended these volumes as a sort of thesaurus, and we should look to his shorter papers for a compact account of his method. But if we turn to his earlier papers on Quaternions in the Philosophical Magazine, in which principally he introduced the subject to the notice of his contemporaries, we find them entitled “On Quaternions; or on a New System of Imaginaries in Algebra,” and in them we find a great deal about imaginaries, and very little of a vector analysis. To show how slowly the system of vector analysis developed itself in the quaternionic nidus, we need only say that the symbols S, V, and $\nabla$ do not appear until two or three years after the discovery of quaternions.
In short, it seems to have been only a secondary object with Hamilton to express the geometrical relations of vectors, secondary in time, and also secondary in this, that it was never allowed to give shape to his work.

But this relates to the past. In regard to the present status, I beg leave to quote what Mr. McAulay has said on another occasion (see Phil. Mag., June 1892): “Quaternions differ in an important respect from other branches of mathematics that are studied by mathematicians after they have in the course of years of hard labour laid the foundation of all their future work. In nearly all cases these branches are very properly so called. They each grow out of a definite spot of the main tree of mathematics, and derive their sustenance from the sap of the trunk as a whole. But not so with quaternions. To let these grow in the brain of a mathematician, he must start from the seed as with the rest of his mathematics regarded as a whole. He cannot graft them on his already flourishing tree, for they will die there. They are independent plants that require separate sowing and the consequent careful tending.”

Can we wonder that mathematicians, physicists, astronomers, and geometers feel some doubt as to the value or necessity of something so separate from all other branches of learning? Can that be a natural treatment of the subject which has no relations to any other method, and, as one might suppose from reading some treatises, has only occurred to a single man? Or, at best, is it not discouraging to be told that in order to use the quaternionic method, one must give up the progress which he has already made in the pursuit of his favourite science, and go back to the beginning and start anew on a parallel course?

I believe, however, that if what I have quoted is true of vector methods, it is because there is something fundamentally wrong in the presentation of the subject.
Of course, in some sense and to some extent it is and must be true. Whatever is special, accidental, and individual, will die, as it should; but that which is universal and essential should remain as an organic part of the whole intellectual acquisition. If that which is essential dies with the accidental, it must be because the accidental has been given the prominence which belongs to the essential. For myself, I should preach no such doctrine to those whom I wish to convert to the true faith.

In Italy, they say, all roads lead to Rome. In mechanics, kinematics, astronomy, physics, all study leads to the consideration of certain relations and operations. These are the capital notions; these should have the leading parts in any analysis suited to the subject.

If I wished to attract the student of any of these sciences to an algebra for vectors, I should tell him that the fundamental notions of this algebra were exactly those with which he was daily conversant. I should tell him that a vector algebra is so far from being any one man’s production that half a century ago several were already working toward an algebra which should be primarily geometrical and not arithmetical, and that there is a remarkable similarity in the results to which these efforts led (see Proc. A.A.A.S. for 1886, pp. 37, ff.) [this vol. p. 91, ff.]. I should call his attention to the fact that Lagrange and Gauss used the notation $(\alpha\beta\gamma)$ to denote precisely the same as Hamilton by his $S(\alpha\beta\gamma)$, except that Lagrange limited the expression to unit vectors, and Gauss to vectors of which the length is the secant of the latitude, and I should show him that we have only to give up these limitations, and the expression (in connection with the notion of geometrical addition) is endowed with an immense wealth of transformations.
I should call his attention to the fact that the notation $[r_1 r_2]$, universal in the theory of orbits, is identical with Hamilton’s $V(\rho_1\rho_2)$, except that Hamilton takes the area as a vector, i.e., includes the notion of the direction of the normal to the plane of the triangle, and that with this simple modification (and with the notion of geometrical addition of surfaces as well as of lines) this expression becomes closely connected with the first-mentioned, and is not only endowed with a similar capability for transformation, but enriches the first with new capabilities. In fact, I should tell him that the notions which we use in vector analysis are those which he who reads between the lines will meet on every page of the great masters of analysis, or of those who have probed deepest the secrets of nature, the only difference being that the vector analyst, having regard to the weakness of the human intellect, does as the early painters who wrote beneath their pictures “This is a tree,” “This is a horse.”

I cannot attach quite so much importance as Mr. McAulay to uniformity of notation. That very uniformity, if it existed among those who use a vector analysis, would rather obscure than reveal their connection with the general course of modern thought in mathematics and physics. There are two ways in which we may measure the progress of any reform. The one consists in counting those who have adopted the shibboleth of the reformers; the other measure is the degree in which the community is imbued with the essential principles of the reform. I should apply the broader measure to the present case, and do not find it quite so bad as Mr. McAulay does.

Yet the question of notations, although not the vital question, is certainly important, and I assure Mr. McAulay that reluctance to make unnecessary innovations in notation has been a very powerful motive in restraining me from publication.
Indeed my pamphlet on Vector Analysis, which has excited the animadversion of quaternionists, was never formally published, although rather widely distributed, so long as I had copies to distribute, among those who I thought might be interested in the subject. I may say, however, since I am called upon to defend my position, that I have found the notations of that pamphlet more flexible than those generally used. Mr. McAulay, at least, will understand what I mean by this, if I say that some of the relations which he has thought of sufficient importance to express by means of special devices (see Proc. R.S.E. for 1890-91), may be expressed at least as briefly in the notations which I have used, and without special devices. But I should not have been satisfied for the purposes of my pamphlet with any notation which should suggest even to the careless reader any connection with the notion of the quaternion. For I confess that one of my objects was to show that a system of vector analysis does not require any support from the notion of the quaternion, or, I may add, of the imaginary in algebra.

I should hardly dare to express myself with so much freedom, if I could not shelter myself behind an authority which will not be questioned. I do not see that I have done anything very different from what the eminent mathematician upon whom Hamilton’s mantle has fallen has been doing, it would seem, unconsciously. Contrast the system of quaternions, which he has described in his sketch of Hamilton’s life and work in the North British Review for September, 1866, with the system which he urges upon the attention of physicists in the Philosophical Magazine in 1890. In 1866 we have a great deal about imaginaries, and nearly as much about the quaternion. In 1890 we have nothing about imaginaries, and little about the quaternion. Prof.
Tait has spoken of the calculus of quaternions as throwing off in the course of years its early Cartesian trammels. I wonder that he does not see how well the progress in which he has led may be described as throwing off the yoke of the quaternion. A characteristic example is seen in the use of the symbol $\nabla$. Hamilton applies this to a vector to form a quaternion, Tait to form a linear vector function. But while breathing a new life into the formulae of quaternions, Prof. Tait stands stoutly by the letter.

Now I appreciate and admire the generous loyalty toward one whom he regards as his master, which has always led Prof. Tait to minimise the originality of his own work in regard to quaternions, and write as if everything was contained in the ideas which flashed into the mind of Hamilton at the classic Brougham Bridge. But not to speak of other claims of historical justice, we owe duties to our scholars as well as to our teachers, and the world is too large, and the current of modern thought is too broad, to be confined by the ipse dixit even of a Hamilton.

X. QUATERNIONS AND VECTOR ANALYSIS.

[Nature, vol. xlviii. pp. 364-367, Aug. 17, 1893.]

In a paper by Prof. C. G. Knott on “Recent Innovations in Vector Theory,” of which an abstract has been given in Nature (vol. xlvii, pp. 590-593; see also a minor abstract on p. 287), the doctrine that the quaternion affords the only sufficient and proper basis for vector analysis is maintained by arguments based so largely on the faults and deficiencies which the author has found in my pamphlet, Elements of Vector Analysis, as to give to such faults an importance which they would not otherwise possess, and to make some reply from me necessary, if I would not discredit the cause of non-quaternionic vector analysis. Especially is this true in view of the warm commendation and endorsement of the paper, by Prof. Tait, which appeared in Nature somewhat earlier (p. 225).
The charge which most requires a reply is expressed most distinctly in the minor abstract, viz., “that in the development of his dyadic notation, Prof. Gibbs, being forced to bring the quaternion in, logically condemned his own position.” This was incomprehensible to me until I received the original paper, where I found the charge specified as follows: “Although Gibbs gets over a good deal of ground without the explicit recognition of the complete product, which is the difference of his ‘skew’ and ‘direct’ products, yet even he recognises in plain language the versorial character of a vector, brings in the quaternion whose vector is the difference of a linear vector function and its conjugate, and does not hesitate to use the accursed thing itself in certain line, surface, and volume integrals” (Proc. R.S.E., Session 1892-3, p. 236). These three specifications I shall consider in their inverse order, premising, however, that the epitheta ornantia are entirely my critic’s.

The last charge is due entirely to an inadvertence. The integrals referred to are those given at the close of the major abstract in Nature (p. 593). My critic, in his original paper, states quite correctly that, according to my definitions and notations, they should represent dyadics. He multiplies them into a vector, introducing the vector under the integral sign, as is perfectly proper, provided, of course, that the vector is constant. But failing to observe this restriction, evidently through inadvertence, and finding that the resulting equations (thus interpreted) would not be true, he concludes that I must have meant something else by the original equations. Now, these equations will hold if interpreted in the quaternionic sense, as is, indeed, a necessary consequence of their holding in the dyadic sense, although the converse would not be true.
My critic was thus led, in consequence of the inadvertence mentioned, to suppose that I had departed from my ordinary usage and my express definitions, and had intended the products in these integrals to be taken in the quaternionic sense. This is the sole ground for the last charge.

The second charge evidently relates to the notations $\Phi_S$ and $\Phi_\times$ (see Nature, vol. xlvii, p. 592). It is perfectly true that I have used a scalar and a vector connected with the linear vector operator, which, if combined, would form a quaternion. I have not thus combined them. Perhaps Prof. Knott will say that since I use both of them it matters little whether I combine them or not. If so I heartily agree with him.

The first charge is a little vague. I certainly admit that vectors may be used in connection with and to represent rotations. I have no objection to calling them in such cases versorial. In that sense Lagrange and Poinsot, for example, used versorial vectors. But what has this to do with quaternions? Certainly Lagrange and Poinsot were not quaternionists.

The passage in the major abstract in Nature which most distinctly charges me with the use of the quaternion is that in which a certain expression which I use is said to represent the quaternion operator $q(\ )q^{-1}$ (vol. xlvii, p. 592). It would be more accurate to say that my expression and the quaternionic expression represent the same operator. Does it follow that I have used a quaternion? Not at all. A quaternionic expression may represent a number. Does everyone who uses any expression for that number use quaternions? A quaternionic expression may represent a vector. Does everyone who uses any expression for that vector use quaternions? A quaternionic expression may represent a linear vector operator. If I use an expression for that linear vector operator do I therefore use quaternions?
My critic is so anxious to prove that I use quaternions that he uses arguments which would prove that quaternions were in common use before Hamilton was born.

So much for the alleged use of the quaternion in my pamphlet. Let us now consider the faults and deficiencies which have been found therein and attributed to the want of the quaternion. The most serious criticism in this respect relates to certain integrating operators, which Prof. Tait unites with Prof. Knott in ridiculing. As definitions are wearisome, I will illustrate the use of the terms and notations which I have used by quoting a sentence addressed to the British Association a few years ago. The speaker was Lord Kelvin.

“Helmholtz first solved the problem: Given the spin in any case of liquid motion, to find the motion. His solution consists in finding the potentials of three ideal distributions of gravitational matter having densities respectively equal to $1/\pi$ of the rectangular components of the given spin; and, regarding for a moment these potentials as rectangular components of velocity in a case of liquid motion, taking the spin in this motion as the velocity in the required motion” (Nature, vol. xxxviii, p. 569).

In the terms and notations of my pamphlet the problem and solution may be thus expressed:

Given the curl in any case of liquid motion, to find the motion. The required velocity is $1/4\pi$ of the curl of the potential of the given curl.

Or, more briefly: The required velocity is $1/4\pi$ of the Laplacian of the given curl.

Or in purely analytical form: Required $\omega$ in terms of $\nabla\times\omega$, when $\nabla\cdot\omega = 0$. Solution: $\omega = \frac{1}{4\pi}\,\nabla\times\mathrm{Pot}\,\nabla\times\omega = \frac{1}{4\pi}\,\mathrm{Lap}\,\nabla\times\omega$.

(The Laplacian expresses the result of an operation like that by which magnetic force is calculated from electric currents distributed in space. This corresponds to the second form in which Helmholtz expressed his result.)
To show the incredible rashness of my critics, I will remark that these equations are among those of which it is said in the original paper (Proc. R.S.E., Session 1892-93, p. 225), “Gibbs gives a good many equations, theorems I suppose they ape at being.” I may add that others of the equations thus characterized are associated with names not less distinguished than that of Helmholtz. But that to which I wish especially to call attention is that the terms and notations in question express exactly the notions which physicists want to use.

But we are told (Nature, vol. xlvii, p. 287) that these integrating operators (Pot, Lap) are best expressed as inverse functions of $\nabla$. To see how utterly inadequate the Nabla would have been to express the idea, we have only to imagine the exclamation points which the members of the British Association would have looked at each other if the distinguished speaker had said:

Helmholtz first solved the problem: Given the Nabla of the velocity in any case of liquid motion, to find the velocity. His solution was that the velocity was the Nabla of the inverse square of Nabla of the Nabla of the velocity. Or, that the velocity was the inverse Nabla of the Nabla of the velocity.

Or, if the problem and solution had been written thus: Required $\omega$ in terms of $\nabla\omega$ when $S\nabla\omega = 0$. Solution: $\omega = \nabla\nabla^{-2}\nabla\omega = \nabla^{-1}\nabla\omega$.

My critic has himself given more than one example of the unfitness of the inverse Nabla for the exact expression of thought. For example, when he says that I have taken “eight distinct steps to prove two equations, which are special cases of $\nabla^{-2}\nabla^{2}u = u$,” I do not quite know what he means. If he means that I have taken eight steps to prove Poisson’s Equation (which certainly is not expressed by the equation cited, although it may perhaps be associated with it in some minds), I will only say that my proof is not very long, especially as I have aimed at greater rigor than is usually thought necessary.
I cannot, however, compare my demonstration with that of quaternionic writers, as I have not been able (doubtless on account of insufficient search) to find any such.

To show how little foundation there is for the charge that the deficiencies of my system require to be pieced out by these integral operators, I need only say that if I wished to economise operators I might give up New, Lap, and Max, writing for them $\nabla\,\mathrm{Pot}$, $\nabla\times\mathrm{Pot}$, and $\nabla\cdot\mathrm{Pot}$, and if I wished further to economise in what costs so little, I could give up the potential also by using the notation $(\nabla\cdot\nabla)^{-1}$ or $\nabla^{-2}$. That is, I could have used this notation without greater sacrifice of precision than quaternionic writers seem to be willing to make. I much prefer, however, to avoid these inverse operators as essentially indefinite.

Nevertheless, although my critic has greatly obscured the subject by ridiculing operators, which I beg leave to maintain are not worthy of ridicule, and by thoughtlessly asserting that it was necessary for me to use them, whereas they are only necessary for me in the sense in which something of the kind is necessary for the quaternionist also, if he would use a notation irreproachable on the score of exactness, I desire to be perfectly candid. I do not wish to deny that the relations connected with these notations appear a little more simple in the quaternionic form. I had, indeed, this subject principally in mind when I said two years ago in Nature (vol. xliii, p. 512) [this vol. p. 158]: "There are a few formulas in which there is a trifling gain in compactness in the use of the quaternion." Let us see exactly how much this advantage amounts to.

There is nothing which the most rigid quaternionist need object to in the notation for the potential, or indeed for the Newtonian. These represent respectively the operations by which the potential or the force of gravitation is calculated from the density of matter.
A quaternionist would, however, apply the operator New not only to a scalar, as I have done, but to a vector also. The vector part of $\mathrm{New}\,\omega$ (construed in the quaternionic sense) would be exactly what I have represented by $\mathrm{Lap}\,\omega$, and the scalar part, taken negatively, would be exactly what I have represented by $\mathrm{Max}\,\omega$. The quaternionist has here a slight economy in notations, which is of less importance, since all the operators (New, Lap, Max) may be expressed without ambiguity in terms of the potential, which is therefore the only one necessary for the exact expression of thought.

But what are the formulae which it is necessary for one to remember who uses my notations? Evidently only those which contain the operator Pot. For all the others are derived from these by the simple substitutions $\mathrm{New} = \nabla\,\mathrm{Pot}$, $\mathrm{Lap} = \nabla\times\mathrm{Pot}$, $\mathrm{Max} = \nabla\cdot\mathrm{Pot}$. Whether one is quaternionist or not, one must remember Poisson's Equation, which I write $\nabla\cdot\nabla\,\mathrm{Pot}\,\omega = -4\pi\omega$, and in quaternionic might be written $\nabla^{2}\,\mathrm{Pot}\,\omega = 4\pi\omega$. If [...], just as the square of the tensor of $q$ might be defined as the product of the latent roots of $q$. Again, it has the property represented by the equation [...] which corresponds exactly with the preceding equation with both sides squared.

There is another scalar quantity connected with the quaternion and represented by the notation $Sq$. It has the important property expressed by the equation $S(qrs) = S(rsq) = S(sqr)$, and so for products of any number of quaternions, in which the cyclic order remains unchanged. In the theory of the linear vector operator there is an important quantity which I have represented by the notation $\Phi_S$, and which has the property represented by the equation [...] where the number of the factors is as before immaterial. $\Phi_S$ may be defined as the sum of the latent roots of $\Phi$, just as $2Sq$ may be defined as the sum of the latent roots of $q$.
The analogy of these notations may be further illustrated by comparing the equations $T(e^{q}) = e^{Sq}$ and $|e^{\Phi}| = e^{\Phi_S}$. I do not see why it is not as reasonable for the vector analyst to have notations like $|\Phi|$ and $\Phi_S$, as for the quaternionist to have the notations $Tq$ and $Sq$.

This is of course an argumentum ad quaternionisten. I do not pretend that it gives the reason why I used these notations, for the identification of the quaternion with a matrix was, I think, unknown to me when I wrote my pamphlet. The real justification of the notations $|\Phi|$ and $\Phi_S$ is that they express functions of the linear vector operator quâ quantity, which physicists and others have continually occasion to use. And this justification applies to other notations which may not have their analogues in quaternions. Thus I have used $\Phi_\times$ to express a vector so important in the theory of the linear vector operator, that it can hardly be neglected in any treatment of the subject. It is described, for example, in treatises as different as Thomson and Tait's Natural Philosophy and Kelland and Tait's Quaternions. In the former treatise the components of the vector are, of course, given in terms of the elements of the linear vector operator, which is in accordance with the method of the treatise. In the latter treatise the vector is expressed by $V\alpha\alpha' + V\beta\beta' + V\gamma\gamma'$. As this supposes the linear vector operator to be given not by a single letter, but by several vectors, it must be regarded as entirely inadequate by any one who wishes to treat the subject in the spirit of multiple algebra, i.e. to use a single letter to represent the linear vector operator.

But my critic does not like the notations $|\Phi|$, $\Phi_S$, $\Phi_\times$. His ridicule, indeed, reaches high-water mark in the paragraphs in which he mentions them. Concerning another notation, [...] (defined in Nature, vol. xliii, p. 518) [this vol., p.
160], he exclaims, "Thus burden after burden, in the form of new notation, is added apparently for the sole purpose of exercising the faculty of memory." He would vastly prefer, it would appear, to write with Hamilton $m\phi'^{-1}$, "where $m$ represents what the unit volume becomes under the influence of the linear operator." But this notation is only apparently compact, since the $m$ requires explanation. Moreover, if a strain were given in what Hamilton calls the standard trinomial form, to write out the formula for the operator on surfaces in that standard form by the use of the expression $m\phi'^{-1}$ would require, it seems to me, ten (if not fifty) times the effort of memory and of ingenuity, which would be required for the same purpose with the use of [...].

I may here remark that Prof. Tait's letter of endorsement of Prof. Knott's paper affords a striking illustration of the convenience and flexibility of a notation entirely analogous to [...], viz., [...]. He gives the form [...] to illustrate the advantage of quaternionic notations in point of brevity. If I understand his notation, this is what I should write [...], which is practically identical with my [...] (viz., the operator which expresses the relation between [...], or $(\Phi\times\rho)_S$, or $(\Phi\times\rho)_\times$, he would, I think, find the operand very much in the way.

XI.

ON DOUBLE REFRACTION AND THE DISPERSION OF COLORS IN PERFECTLY TRANSPARENT MEDIA.

[American Journal of Science, ser. 3, vol. xxiii, pp. 262-275, April, 1882.]

1. In calculating the velocity of a system of plane waves of homogeneous light, regarded as oscillating electrical fluxes, in transparent and sensibly homogeneous bodies, whether singly or doubly refracting, we may assume that such a body is a very fine-grained structure, so that it can be divided into parts having their dimensions very small in comparison with the wave-length, each of which may be regarded as entirely similar to every other, while in the interior of each there are wide differences in electrical as in other physical properties.
Hence, the average electrical displacement in such parts of the body may be expressed as a function of the time and the coordinates of position by the ordinary equations of wave-motion, while the real displacement at any point will in general differ greatly from that represented by such equations. It is the object of this paper to investigate the velocity of light in perfectly transparent media which have not the property of circular polarization in a manner which shall take account of this difference between the real displacements and those represented by the ordinary equations of wave-motion. We shall find that this difference will account for the dispersion of colors, without affecting the validity of the laws of Huyghens and Fresnel for double refraction with respect to light of any one color.

In this investigation, it is assumed that the electrical displacements are solenoidal, or, in other words, that they are such as not to produce any change in electrical density. The disturbance in the medium is treated as consisting entirely of such electrical displacements and fluxes, and not complicated by any distinctively magnetic phenomena. It might therefore be more accurate to call the theory (as here developed) electrical rather than electromagnetic. The latter term is nevertheless retained in accordance with general usage, and with that of the author of the theory.

Since the velocity which we are seeking is equal to the wave-length divided by the period of oscillation, the problem reduces to finding the ratio of these quantities, and may be simplified in some respects by supposing that we have to do with a system of stationary waves. That the relation of the wave-length and the period is the same for stationary as for progressive waves is evident from the consideration that a system of stationary waves may be formed by two systems of progressive waves having opposite directions.

2.
Let $x$, $y$, $z$ be the rectangular coordinates of any point in the medium, which with the system of waves we may regard as indefinitely extended, and let $\xi+\xi'$, $\eta+\eta'$, $\zeta+\zeta'$ be the components of electrical displacement at that point at the time $t$; $\xi$, $\eta$, $\zeta$ being the average values of the components of electrical displacement at that time in a wave-plane passing through the point. Then $\xi$, $\eta$, $\zeta$, $\xi'$, $\eta'$, $\zeta'$ are perfectly defined quantities, of which $\xi$, $\eta$, $\zeta$ are connected with $x$, $y$, $z$, and $t$ by the ordinary equations of wave-motion, while each of the quantities $\xi'$, $\eta'$, $\zeta'$ has always zero for its average value in any wave-plane. We may call $\xi$, $\eta$, $\zeta$ the components of the regular part of the displacement, and $\xi'$, $\eta'$, $\zeta'$ the components of the irregular part of the displacement. In like manner, the differential coefficients of these quantities with respect to the time, $\dot\xi$, $\dot\eta$, $\dot\zeta$, $\dot\xi'$, $\dot\eta'$, $\dot\zeta'$, may be called respectively the components of the regular part of the flux, and the components of the irregular part of the flux.

Let the whole space be divided into elements of volume $Dv$, very small in all dimensions in comparison with a wave-length, but enclosing portions of the medium which may be treated as entirely similar to one another, and therefore not infinitely small. Thus a crystal may be divided into elementary parallelopipeds, all the vertices of which are similarly situated with respect to the internal structure of the crystal. Amorphous solids and liquids may not be capable of division into equally small portions of which physical similarity can be predicated with the same rigor. Yet we may suppose them capable of a division substantially satisfying the requirements.

From these definitions it follows that at any given instant the average value of each of the quantities $\xi'$, $\eta'$, $\zeta'$ in an element $Dv$ is zero. For the average value in one such element must be sensibly the same as in any other situated on the same wave-plane.
If this average were not zero, the average for the wave-plane would not be zero. Moreover, at any given instant, the values of $\xi$, $\eta$, $\zeta$ may be regarded as constant throughout any element $Dv$, and as representing the average values of the components of displacement in that element. The same will be true of the quantities $\dot\xi$, $\dot\eta$, $\dot\zeta$ and $\ddot\xi$, $\ddot\eta$, $\ddot\zeta$.

3. Since we have excluded the case of media which have the property of circular polarization, we shall not impair the generality of our results if we suppose that we have to do with linearly polarized light, i.e., that the regular part of the displacement is everywhere parallel to the same fixed line, all cases not already excluded being reducible to this. Then, with the origin of coordinates and the zero of time suitably chosen, the regular part of the displacement may be represented by the equations

$$\xi = \alpha\cos 2\pi\frac{u}{l}\cos 2\pi\frac{t}{p},\qquad \eta = \beta\cos 2\pi\frac{u}{l}\cos 2\pi\frac{t}{p},\qquad \zeta = \gamma\cos 2\pi\frac{u}{l}\cos 2\pi\frac{t}{p}, \tag{1}$$

where $l$ denotes the wave-length, $p$ the period of vibration, $\alpha$, $\beta$, $\gamma$ the maximum amplitudes of the displacements $\xi$, $\eta$, $\zeta$, and $u$ the distance of the point considered from the wave-plane which passes through the origin. Since $u$ is a linear function of $x$, $y$, and $z$, we may regard these equations as giving the values of $\xi$, $\eta$, $\zeta$, for a given system of waves, in terms of $x$, $y$, $z$, and $t$.

4. The components of the irregular displacement, $\xi'$, $\eta'$, $\zeta'$, at any given point, will evidently be simple harmonic functions of the time, having the same period as the regular part of the displacement. That they will also have the same phase is not quite so evident, and would not be the case in a medium in which there were any absorption or dispersion of light. It will however appear from the following considerations that in perfectly transparent media the irregular oscillations are synchronous with the regular.
For if they are not synchronous, we may resolve the irregular oscillations into two parts, of which one shall be synchronous with the regular oscillations, and the other shall have a difference of phase of one-fourth of a complete oscillation. Now if the medium is one in which there is no absorption or dispersion of light, we may assume that the same electrical configurations may also be passed through in the inverse order, which would be represented analytically by writing $-t$ for $t$ in the equations which give $\xi$, $\eta$, $\zeta$, $\xi'$, $\eta'$, $\zeta'$ as functions of $x$, $y$, $z$, and $t$. But this change would not affect the regular oscillations, nor the synchronous part of the irregular oscillations, which depends on the cosine of the time, while the non-synchronous part of the irregular oscillations, which depends on the sine of the time, would simply have its direction reversed. Hence, by taking first one-half the sum, and secondly one-half the difference, of the original motion and that obtained by substitution of $-t$ for $t$, we may separate the non-synchronous part of the irregular oscillations from the rest of the motion. Therefore, the supposed non-synchronous part of the irregular displacement, if capable of existence, is at least wholly independent of the wave-motion and need not be considered by us.

We may go farther in the determination of the quantities $\xi'$, $\eta'$, $\zeta'$. For in view of the very fine-grained structure of the medium, it will easily appear that the manner in which the general or average flux in any element $Dv$ (represented by $\dot\xi$, $\dot\eta$, $\dot\zeta$) distributes itself among the molecules and intermolecular spaces must be entirely determined by the amount and direction of that flux and its period of oscillation.
Hence, and on account of the superposable character of the motions which we are considering, we may conclude that the values of $\xi'$, $\eta'$, $\zeta'$ at any given point in the medium are capable of expression as linear functions of $\xi$, $\eta$, $\zeta$ in a manner which shall be independent of the time and of the orientation of the wave-planes and the distance of a nodal plane from the point considered, so long as the period of oscillation remains the same. But a change in the period may presumably affect the relation between $\xi'$, $\eta'$, $\zeta'$ and $\xi$, $\eta$, $\zeta$ to a certain extent. And the relation between $\xi'$, $\eta'$, $\zeta'$ and $\xi$, $\eta$, $\zeta$ will vary rapidly as we pass from one point to another within the element $Dv$.

5. In the motion which we are considering there occur alternately instants of no velocity and instants of no displacement. The statical energy of the medium at an instant of no velocity must be equal to its kinetic energy at an instant of no displacement. Let us examine each of these quantities, and consider the equation which expresses their equality.

6. Since in every part of an element $Dv$ the irregular as well as the regular part of the displacement is entirely determined (for light of a given period) by the values of $\xi$, $\eta$, $\zeta$, the statical energy of the element must be a quadratic function of $\xi$, $\eta$, $\zeta$, say

$$(A\xi^{2} + B\eta^{2} + C\zeta^{2} + E\eta\zeta + F\zeta\xi + G\xi\eta)\,Dv,$$

where $A$, $B$, etc. depend only on the nature of the medium and the period of oscillation. At an instant of no velocity, when $\sin 2\pi\frac{t}{p} = 0$ and $\cos^{2} 2\pi\frac{t}{p} = 1$, the above expression will reduce by equations (1) to

$$(A\alpha^{2} + B\beta^{2} + C\gamma^{2} + E\beta\gamma + F\gamma\alpha + G\alpha\beta)\cos^{2} 2\pi\frac{u}{l}\,Dv.$$

Since the average value of $\cos^{2} 2\pi\frac{u}{l}$ in an indefinitely extended space is $\frac{1}{2}$, we have for the statical energy in a unit of volume

$$S = \tfrac{1}{2}(A\alpha^{2} + B\beta^{2} + C\gamma^{2} + E\beta\gamma + F\gamma\alpha + G\alpha\beta). \tag{2}$$

7. The kinetic energy of the whole medium is represented by the double volume-integral*

$$\tfrac{1}{2}\sum\iint \frac{(\dot\xi+\dot\xi')_{1}\,(\dot\xi+\dot\xi')_{2}}{r}\,dv_{1}\,dv_{2},$$

* The fluxes are supposed to be measured by the electromagnetic system of units.
It is to be observed that the difference of opinion which has prevailed with respect to the estimation of the energy of electrical currents does not extend to such as are solenoidal, which may be regarded as composed of closed circuits.

where $dv_{1}$, $dv_{2}$ are two infinitesimal elements of volume, $(\dot\xi+\dot\xi')_{1}$, $(\dot\xi+\dot\xi')_{2}$ the corresponding components of flux, $r$ the distance between the elements, and $\sum$ denotes a summation with respect to the coordinate axes. Separating the integrations, we may write for the same quantity

$$\tfrac{1}{2}\sum\int (\dot\xi+\dot\xi')_{2}\left[\int \frac{(\dot\xi+\dot\xi')_{1}}{r}\,dv_{1}\right]dv_{2}.$$

It is evident that the integral within the brackets is derived from $\dot\xi+\dot\xi'$ by the same process by which the potential of any mass is derived from its density. If we use the symbol Pot to express this relation, we may write for the kinetic energy

$$\tfrac{1}{2}\sum\int (\dot\xi+\dot\xi')\,\mathrm{Pot}\,(\dot\xi+\dot\xi')\,dv.$$

The operation denoted by this symbol is evidently distributive, so that $\mathrm{Pot}\,(\dot\xi+\dot\xi') = \mathrm{Pot}\,\dot\xi + \mathrm{Pot}\,\dot\xi'$. The expression for the kinetic energy may therefore be expanded into

$$\tfrac{1}{2}\sum\int \dot\xi\,\mathrm{Pot}\,\dot\xi\,dv + \tfrac{1}{2}\sum\int \dot\xi\,\mathrm{Pot}\,\dot\xi'\,dv + \tfrac{1}{2}\sum\int \dot\xi'\,\mathrm{Pot}\,\dot\xi\,dv + \tfrac{1}{2}\sum\int \dot\xi'\,\mathrm{Pot}\,\dot\xi'\,dv.$$

But $\dot\xi'$, and therefore $\mathrm{Pot}\,\dot\xi'$, has in every wave-plane the average value zero. Also $\dot\xi$, and therefore $\mathrm{Pot}\,\dot\xi$, has in every wave-plane a constant value. Therefore the second and third integrals in the above expression will vanish, leaving for the kinetic energy

$$\tfrac{1}{2}\sum\int \dot\xi\,\mathrm{Pot}\,\dot\xi\,dv + \tfrac{1}{2}\sum\int \dot\xi'\,\mathrm{Pot}\,\dot\xi'\,dv, \tag{3}$$

which is to be calculated for a time of no displacement, when

$$\dot\xi = \pm\frac{2\pi\alpha}{p}\cos 2\pi\frac{u}{l},\qquad \dot\eta = \pm\frac{2\pi\beta}{p}\cos 2\pi\frac{u}{l},\qquad \dot\zeta = \pm\frac{2\pi\gamma}{p}\cos 2\pi\frac{u}{l}. \tag{4}$$

The form of the expression (3) indicates that the kinetic energy consists of two parts, one of which is determined by the regular part of the flux, and the other by the irregular part of the flux.

8. The value of $\mathrm{Pot}\,\dot\xi$ may be easily found by integration, but perhaps more readily by Poisson's well-known theorem, that if $q$ is any function of position in space (as the density of a certain mass),

$$\frac{d^{2}\,\mathrm{Pot}\,q}{dx^{2}} + \frac{d^{2}\,\mathrm{Pot}\,q}{dy^{2}} + \frac{d^{2}\,\mathrm{Pot}\,q}{dz^{2}} = -4\pi q, \tag{5}$$

where the direction of the coordinate axes is immaterial, provided that they are rectangular.
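Poisson's theorem, as written in equation (5), can be illustrated numerically for a plane-periodic density. The sketch below is an assumption-laden check rather than part of the paper: it takes $q = \cos 2\pi u/l$ with $u$ measured along an arbitrarily chosen unit normal, proposes $(l^{2}/\pi)\,q$ as its potential, and confirms by finite differences along three rectangular axes that the sum of the second derivatives is $-4\pi q$. The function names, the normal, and the wave-length are illustrative choices.

```python
import math

L_WAVE = 1.5                      # wave-length l (illustrative value)
NORMAL = (2 / 3, 1 / 3, 2 / 3)    # unit normal to the wave-planes (illustrative)

def density(x, y, z):
    """Plane-periodic density q = cos(2*pi*u/l), u = distance along NORMAL."""
    u = x * NORMAL[0] + y * NORMAL[1] + z * NORMAL[2]
    return math.cos(2 * math.pi * u / L_WAVE)

def potential(x, y, z):
    """Candidate potential (l^2/pi) * q for the periodic density."""
    return (L_WAVE ** 2 / math.pi) * density(x, y, z)

def laplacian(f, x, y, z, h=1e-3):
    """Second-order finite-difference sum of the three second derivatives."""
    return (f(x + h, y, z) + f(x - h, y, z)
            + f(x, y + h, z) + f(x, y - h, z)
            + f(x, y, z + h) + f(x, y, z - h)
            - 6 * f(x, y, z)) / (h * h)

def poisson_residual(x, y, z):
    """Left side of (5) minus right side; should be close to zero."""
    return laplacian(potential, x, y, z) + 4 * math.pi * density(x, y, z)

print(poisson_residual(0.3, -0.2, 0.7))
```

The residual is small whatever direction is chosen for the wave-normal relative to the axes, which is the sense in which the direction of the rectangular axes is immaterial.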
In applying this to $\mathrm{Pot}\,\dot\xi$, we may place two of the axes in a wave-plane. This will give

$$\frac{d^{2}\,\mathrm{Pot}\,\dot\xi}{du^{2}} = -4\pi\dot\xi. \tag{6}$$

In a nodal plane, $\mathrm{Pot}\,\dot\xi = 0$, since $\dot\xi$ has equal positive and negative values in elements of volume symmetrically distributed with respect to any point in such a plane. In a wave-crest (or plane in which $\dot\xi$ has a maximum value), $\mathrm{Pot}\,\dot\xi$ will also have a maximum value, which we may call $K$. For intermediate points we may determine its value from the consideration that the total disturbance may be resolved into two systems of waves, one having a wave-crest, and the other a nodal plane passing through the point for which the potential is sought. The maximum amplitudes of these component systems will be to the maximum amplitude of the original system as $\cos 2\pi\frac{u}{l}$ and $\sin 2\pi\frac{u}{l}$ to unity. But the second of the component systems will contribute nothing to the value of the potential. We thus obtain

$$\mathrm{Pot}\,\dot\xi = K\cos 2\pi\frac{u}{l},\qquad \frac{d^{2}\,\mathrm{Pot}\,\dot\xi}{du^{2}} = -\frac{4\pi^{2}}{l^{2}}K\cos 2\pi\frac{u}{l} = -\frac{4\pi^{2}}{l^{2}}\,\mathrm{Pot}\,\dot\xi.$$

Comparing this with equation (6), we have

$$\mathrm{Pot}\,\dot\xi = \frac{l^{2}}{\pi}\,\dot\xi. \tag{7}$$

Hence, and by equations (4),

$$\tfrac{1}{2}\sum\int \dot\xi\,\mathrm{Pot}\,\dot\xi\,dv = \frac{l^{2}}{2\pi}\sum\int \dot\xi^{2}\,dv = \frac{2\pi l^{2}}{p^{2}}(\alpha^{2}+\beta^{2}+\gamma^{2})\int \cos^{2} 2\pi\frac{u}{l}\,dv.$$

The kinetic energy of the regular part of the flux is therefore, for each unit of volume,

$$T = \frac{\pi l^{2}}{p^{2}}(\alpha^{2}+\beta^{2}+\gamma^{2}). \tag{8}$$

9. With respect to the kinetic energy of the irregular part of the flux, it is to be observed that, since $\dot\xi'$, $\dot\eta'$, $\dot\zeta'$ have their average values zero in spaces which are very small in comparison with a wave-length, the integrations implied in the notations $\mathrm{Pot}\,\dot\xi'$, $\mathrm{Pot}\,\dot\eta'$, $\mathrm{Pot}\,\dot\zeta'$ may be confined to a sphere of a radius which is small in comparison with a wave-length. Since within such a sphere $\dot\xi'$, $\dot\eta'$, $\dot\zeta'$ are sensibly determined by the values of $\dot\xi$, $\dot\eta$, $\dot\zeta$ at the center of the sphere, which is the point for which the values of the potentials are sought, $\mathrm{Pot}\,\dot\xi'$, $\mathrm{Pot}\,\dot\eta'$, $\mathrm{Pot}\,\dot\zeta'$ must be functions (evidently linear functions) of $\dot\xi$, $\dot\eta$, $\dot\zeta$; and $\dot\xi'\,\mathrm{Pot}\,\dot\xi'$, $\dot\eta'\,\mathrm{Pot}\,\dot\eta'$, $\dot\zeta'\,\mathrm{Pot}\,\dot\zeta'$ must be quadratic functions of the same quantities.
But these functions will vary with the position of the point considered with reference to the adjacent molecules.

Now the expression for the kinetic energy of the irregular part of the flux,

$$\tfrac{1}{2}\sum\int \dot\xi'\,\mathrm{Pot}\,\dot\xi'\,dv,$$

indicates that we may regard the infinitesimal element $dv$ as having the energy (due to this part of the flux)

$$\tfrac{1}{2}\sum \dot\xi'\,\mathrm{Pot}\,\dot\xi'\,dv.$$

Let us consider the energy due to the irregular flux which will belong to the above defined element $Dv$, which is not infinitely small, but which has the advantage of being one of physically similar elements which make up the whole medium. The energy of this element is found by adding the energies of all the infinitesimal elements of which it is composed. Since these are quadratic functions of the quantities $\dot\xi$, $\dot\eta$, $\dot\zeta$, which are sensibly constant throughout the element $Dv$, the sum will be a quadratic function of $\dot\xi$, $\dot\eta$, $\dot\zeta$, say

$$(A'\dot\xi^{2} + B'\dot\eta^{2} + C'\dot\zeta^{2} + E'\dot\eta\dot\zeta + F'\dot\zeta\dot\xi + G'\dot\xi\dot\eta)\,Dv,$$

which will therefore represent the energy of the element $Dv$ due to the irregular flux. The coefficients $A'$, $B'$, etc., are determined by the nature of the medium and the period of oscillation. They will be constant throughout the medium, since one element $Dv$ does not differ from another.

This expression reduces by equations (4) to

$$\frac{4\pi^{2}}{p^{2}}(A'\alpha^{2} + B'\beta^{2} + C'\gamma^{2} + E'\beta\gamma + F'\gamma\alpha + G'\alpha\beta)\cos^{2} 2\pi\frac{u}{l}\,Dv.$$

The kinetic energy of the irregular flux in a unit of volume is therefore

$$T' = \frac{2\pi^{2}}{p^{2}}(A'\alpha^{2} + B'\beta^{2} + C'\gamma^{2} + E'\beta\gamma + F'\gamma\alpha + G'\alpha\beta). \tag{9}$$

10. Equating the statical and kinetic energies, we have

$$\tfrac{1}{2}(A\alpha^{2} + B\beta^{2} + C\gamma^{2} + E\beta\gamma + F\gamma\alpha + G\alpha\beta) = \frac{\pi l^{2}}{p^{2}}(\alpha^{2}+\beta^{2}+\gamma^{2}) + \frac{2\pi^{2}}{p^{2}}(A'\alpha^{2} + B'\beta^{2} + C'\gamma^{2} + E'\beta\gamma + F'\gamma\alpha + G'\alpha\beta). \tag{10}$$

The velocity ($V$) of the corresponding system of progressive waves is given by the equation

$$V^{2} = \frac{l^{2}}{p^{2}} = \frac{1}{2\pi}\,\frac{A\alpha^{2} + B\beta^{2} + C\gamma^{2} + E\beta\gamma + F\gamma\alpha + G\alpha\beta}{\alpha^{2}+\beta^{2}+\gamma^{2}} - \frac{2\pi}{p^{2}}\,\frac{A'\alpha^{2} + B'\beta^{2} + C'\gamma^{2} + E'\beta\gamma + F'\gamma\alpha + G'\alpha\beta}{\alpha^{2}+\beta^{2}+\gamma^{2}}. \tag{11}$$

If we set

$$a = \frac{A}{2\pi} - \frac{2\pi A'}{p^{2}},\qquad b = \frac{B}{2\pi} - \frac{2\pi B'}{p^{2}},\qquad \text{etc.}, \tag{12}$$

and $\rho^{2} = \alpha^{2}+\beta^{2}+\gamma^{2}$, the equation reduces to

$$V^{2} = \frac{a\alpha^{2} + b\beta^{2} + c\gamma^{2} + e\beta\gamma + f\gamma\alpha + g\alpha\beta}{\rho^{2}}. \tag{13}$$
For a given medium and light of a given period, the coefficients $a$, $b$, etc., are constant.

This relation between the velocity of the waves and the direction of oscillation is capable of a very simple geometrical expression. Let $r$ be the radius vector of the ellipsoid

$$ax^{2} + by^{2} + cz^{2} + eyz + fzx + gxy = 1. \tag{14}$$

Then

$$\frac{1}{r^{2}} = \frac{ax^{2} + by^{2} + cz^{2} + eyz + fzx + gxy}{r^{2}}.$$

If this radius is drawn parallel to the electrical oscillations, we shall have

$$\frac{x}{r} = \frac{\alpha}{\rho},\qquad \frac{y}{r} = \frac{\beta}{\rho},\qquad \frac{z}{r} = \frac{\gamma}{\rho},$$

and

$$\frac{1}{r^{2}} = V^{2}. \tag{15}$$

That is, the wave-velocity for any particular direction of oscillation is represented in the ellipsoid by the reciprocal of the radius vector which is parallel to that direction.

11. This relation between the wave-length, the period, and the direction of vibration, must hold true not only of such vibrations as actually occur, but also of such as we may imagine to occur under the influence of constraints determining the direction of vibration in the wave-plane. The directions of the natural or unconstrained vibrations in any wave-plane may be determined by the general mechanical principle that if the type of a natural vibration is infinitesimally altered by the application of a constraint, the value of the period will be stationary.* Hence, in a system of stationary waves such as we have been considering, if the direction of an unconstrained vibration is infinitesimally varied in its wave-plane by a constraint while the wave-length remains constant, the period will be stationary. Therefore, if the direction of the unconstrained vibration is infinitesimally varied by constraint, and the period remains rigorously constant, the wave-length will be stationary. Hence, if we make a central section of the above described ellipsoid parallel to any wave-plane, the directions of natural vibration for that wave-plane will be parallel to the radii vectores of stationary value in that section, viz., to the axes of the ellipse, when the section is elliptical, or to all radii, when the section is circular.

12.
For light of a single period, our hypothesis has led to a perfectly definite result, our equations expressing the fundamental laws of double refraction as enunciated by Fresnel. But if we ask how the velocity of light varies with the period, that is, if we seek to derive from the same equations the laws of the dispersion of colors, we shall not be able to obtain an equally definite result, since the quantities $A$, $B$, etc., and $A'$, $B'$, etc., are unknown functions of the period. If, however, we make the assumption, which is hardly likely to be strictly accurate, but which may quite conceivably be not far removed from the truth, that the manner in which the general or average flux in any small part of the medium distributes itself among the molecules and intermolecular spaces is independent of the period, the quantities $A$, $B$, etc., and $A'$, $B'$, etc., will be constant, and we obtain a very simple relation between $V$ and $p$, which appears to agree tolerably well with the results of experiment.

* See Rayleigh's Theory of Sound, vol. i, p. 84.

If we set

$$H = \frac{A\alpha^{2} + B\beta^{2} + C\gamma^{2} + E\beta\gamma + F\gamma\alpha + G\alpha\beta}{\rho^{2}}, \tag{16}$$

and

$$H' = \frac{A'\alpha^{2} + B'\beta^{2} + C'\gamma^{2} + E'\beta\gamma + F'\gamma\alpha + G'\alpha\beta}{\rho^{2}}, \tag{17}$$

our general equation (11) becomes

$$V^{2} = \frac{l^{2}}{p^{2}} = \frac{H}{2\pi} - \frac{2\pi H'}{p^{2}}, \tag{18}$$

where $H$ and $H'$ will be constant for any given direction of oscillation, when $A$, $B$, etc., and $A'$, $B'$, etc., are constant. If we wish to introduce into the equation the absolute index of refraction ($n$) and the wave-length in vacuo ($\lambda$) in place of $V$ and $p$, we may divide both sides of the equation by the square of the constant ($k$) representing the velocity of light in vacuo. Then, since $\frac{V}{k} = \frac{1}{n}$, and $kp = \lambda$, our equation reduces to

$$\frac{1}{n^{2}} = \frac{H}{2\pi k^{2}} - \frac{2\pi H'}{\lambda^{2}}. \tag{19}$$

It is well known that the relation between $n$ and $\lambda$ may be tolerably well but by no means perfectly represented by an equation of this form.

13.
If we now give up the presumably inaccurate supposition that $A$, $B$, etc., and $A'$, $B'$, etc., are constant, equation (19) will still subsist, but $H$ and $H'$ will not be constant for a given direction of oscillation, but will be functions of $p$, or, what amounts to the same, of $\lambda$. Although we cannot therefore use the equation to derive a priori the relation between $n$ and $\lambda$, we may use it to derive the values of $H$ and $H'$ from the empirically determined relation between $n$ and $\lambda$. To do this, we must make use again of the general principle that an infinitesimal variation in the type of a vibration, due to a constraint, will not affect the period. If we first consider a certain system of stationary waves, then a system in which the wave-length is greater by an infinitesimal $dl$ (the direction of oscillation remaining the same), the period will be increased by an infinitesimal $dp$, and the manner in which the flux distributes itself among the molecules and intermolecular spaces will presumably be infinitesimally changed. But if we suppose that in the second system of waves there is applied a constraint compelling the flux to distribute itself in the same way among the molecules and intermolecular spaces as in the first system (so that $\xi'$, $\eta'$, $\zeta'$ shall be the same functions as before of $\xi$, $\eta$, $\zeta$, a supposition perfectly compatible with the fact that the values of $\xi$, $\eta$, $\zeta$ are changed), this constraint, according to the principle cited, will not affect the period of oscillation. Our equations will apply to such a constrained type of oscillation, and $A$, $B$, etc., and $A'$, $B'$, etc., and therefore $H$ and $H'$, will have the same values in the last described system of waves as in the first system, although the wave-length and the period have been varied. Therefore, in differentiating equation (18), which is essentially an equation between $l$ and $p$, or its equivalent (19), we may treat $H$ and $H'$ as constant.
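The procedure just described, of recovering $H$ and $H'$ from a measured relation between $n$ and $\lambda$, can be sketched numerically. Everything specific below is an assumption made for illustration: the values of $H$, $H'$ and $k$ are hypothetical, the index is generated from equation (19) rather than measured, and the derivative is taken by a central difference. Differentiating (19) with $H$ and $H'$ held constant gives $-\frac{2}{n^{3}}\frac{dn}{d\lambda} = \frac{4\pi H'}{\lambda^{3}}$, which is the relation the code inverts.

```python
import math

K = 1.0                            # velocity of light in vacuo (hypothetical units)
H = 2 * math.pi * 0.45             # hypothetical constant H
H_PRIME = 0.004 / (2 * math.pi)    # hypothetical constant H'

def index(lam):
    """Index of refraction from equation (19) with H and H' constant."""
    inv_n2 = H / (2 * math.pi * K ** 2) - 2 * math.pi * H_PRIME / lam ** 2
    return 1 / math.sqrt(inv_n2)

def dn_dlam(lam, h=1e-6):
    """Central-difference estimate of dn/d(lambda)."""
    return (index(lam + h) - index(lam - h)) / (2 * h)

def recovered_h_prime(lam):
    """H' from n and dn/d(lambda): H' = -(lambda^3 / (2 pi n^3)) dn/d(lambda)."""
    n = index(lam)
    return -lam ** 3 / (2 * math.pi * n ** 3) * dn_dlam(lam)

def recovered_h(lam):
    """H from (19) after substituting the recovered H'."""
    n = index(lam)
    return 2 * math.pi * K ** 2 * (1 / n ** 2 - lam / n ** 3 * dn_dlam(lam))

def log_slope(lam):
    """-d log n / d log lambda, positive when n rises as lambda falls."""
    return -lam / index(lam) * dn_dlam(lam)

lam = 0.5
print(index(lam), recovered_h(lam), recovered_h_prime(lam), log_slope(lam))
```

With real data the same two formulas would be applied to tabulated values of $n$ and a fitted or differenced slope; here they simply return the constants that were put in, which confirms the algebra.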
This gives

$$-\frac{2}{n^{3}}\frac{dn}{d\lambda} = \frac{4\pi H'}{\lambda^{3}}.$$

We thus obtain the values of $H'$ and $H$:

$$H' = -\frac{\lambda^{3}}{2\pi n^{3}}\frac{dn}{d\lambda},\qquad H = \frac{2\pi k^{2}}{n^{2}} - \frac{2\pi k^{2}\lambda}{n^{3}}\frac{dn}{d\lambda}. \tag{20}$$

By determining the values of $H$ and $H'$ for different directions of oscillation, we may determine the values of $A$, $B$, etc., and $A'$, $B'$, etc.

By means of these equations, the ratios of the statical energy ($S$), the kinetic energy due to the regular part of the flux ($T$), and the kinetic energy due to the irregular part of the flux ($T'$), are easily obtained in a form which admits of experimental determination. Equations (8) and (9) give

$$T = \frac{\pi l^{2}\rho^{2}}{p^{2}},\qquad T' = \frac{2\pi^{2}H'\rho^{2}}{p^{2}}.$$

Therefore, by (20),

$$\frac{T'}{T} = \frac{2\pi H'}{l^{2}} = \frac{2\pi H' n^{2}}{\lambda^{2}} = -\frac{\lambda}{n}\frac{dn}{d\lambda} = -\frac{d\log n}{d\log\lambda}, \tag{21}$$

$$\frac{S}{T} = \frac{T+T'}{T} = 1 + \frac{T'}{T} = \frac{d\log\lambda - d\log n}{d\log\lambda} = \frac{d\log l}{d\log\lambda}, \tag{22}$$

$$\frac{T'}{S} = -\frac{d\log n}{d\log l}. \tag{23}$$

Since $S$, $T$, and $T'$ are essentially positive quantities, their ratios must be positive. Equation (21) therefore requires that the index of refraction shall increase as the period or wave-length in vacuo diminishes. Experiment has shown no exceptions to this rule, except such as are manifestly attributable to the absorption of light.

14. It remains to consider the relations between the optical properties of a medium and the planes or axes of symmetry which it may possess. If we consider the statical energy per unit of volume ($S$) and the period as constant, we may regard equation (2) as the equation of an ellipsoid, the radii vectores of which represent in direction and magnitude the amplitudes of systems of waves having the same statical energy. In like manner, if we consider the kinetic energy of the irregular part of the flux per unit of volume ($T'$) and the period as constant, we may regard equation (9) as the equation of an ellipsoid, the radii vectores of which represent in direction and magnitude the amplitudes of systems of waves having the same kinetic energy due to the irregular part of the flux. These ellipsoids, which we may distinguish as the ellipsoids ($A$, $B$, etc.)
and ($A'$, $B'$, etc.), as well as the ellipsoid before described, which we may call the ellipsoid ($a$, $b$, etc.), must be independent in their form and their orientation of the directions of the axes of coordinates, being determined entirely by the nature of the medium and the period of oscillation. They must therefore possess the same kind of symmetry as the internal structure of the medium. If the medium is symmetrical about a certain axis, each ellipsoid must have an axis parallel to that. If the medium is symmetrical with respect to a certain plane, each ellipsoid must have an axis at right angles to that plane. If the medium after a revolution of less than 180° about a certain axis is then equivalent to the medium in its first position, or symmetrical with it with respect to a plane at right angles to that axis, each ellipsoid must have an axis of revolution parallel to that axis. These relations must be the same for light of all colors, and also for all temperatures of the medium.

15. From these principles we may infer the optical characteristics of the different crystallographic systems.

In crystals of the isometric system, as in amorphous bodies, the three ellipsoids reduce to spheres. Such media are optically isotropic at least so far as any properties are concerned which come within the scope of this paper.

In crystals of the tetragonal or hexagonal systems, the three ellipsoids will have axes of rotation parallel to the principal crystallographic axis. Since the ellipsoid ($a$, $b$, etc.) has but one circular section, there will be but one optic axis, which will have a fixed direction.

In crystals of the orthorhombic system, the three ellipsoids will have their axes parallel to the rectangular crystallographic axes.
If we take these directions for the axes of coordinates, E, F, G, E', F', G', e, f, g will vanish and equation (13) will reduce to

$$V^2 = \frac{a\alpha^2 + b\beta^2 + c\gamma^2}{\rho^2}.$$

If the coordinate axes are so placed that $a > b > c$, the optic axes will lie in the X-Z plane, making equal angles with the axis of Z, which may be determined by the equation

$$\tan^2\phi = \frac{a-b}{b-c} = \frac{p^2(A-B) - 4\pi^2(A'-B')}{p^2(B-C) - 4\pi^2(B'-C')}.$$

To get a rough idea of the manner in which $\phi$ varies with the period, we may regard A, B, C, A', B', C' as constant in this equation. But since the lengths of the axes of the ellipsoid (a, b, etc.) vary with the period, it may easily happen that the order of the axes with respect to magnitude is not the same for all colors. In that case, the optic axes for certain colors will lie in one of the principal planes, and for other colors in another. For the color at which the change takes place, the two optic axes will coincide. The differential coefficient $d\phi/dp$ becomes infinitely great as the optic axes approach coincidence.

In crystals of the monoclinic system, each of the three ellipsoids will have an axis perpendicular to the plane of symmetry. We may choose this direction for the axis of X. Then F, G, F', G', f, g will vanish and equation (13) will reduce to

$$V^2 = \frac{a\alpha^2 + b\beta^2 + c\gamma^2 + e\beta\gamma}{\rho^2}.$$

The angle $\theta$ made by one of the axes of the ellipsoid (a, b, etc.) in the plane of symmetry with the axis of Y and measured toward the axis of Z, is determined by the equation

$$\tan 2\theta = \frac{e}{c-b} = \frac{p^2E - 4\pi^2E'}{p^2(C-B) - 4\pi^2(C'-B')}.$$

To get a rough idea of the dispersion of the axes of the ellipsoid (a, b, etc.) in the plane of symmetry, we may regard B, C, E, B', C', E' as constant in this equation, and suppose the axis of Y so placed as to make E vanish.

It is evident that in this system the plane of the optic axes will be fixed, or will rotate about one of the lines which bisect the angles made by the optic axes, according as the mean axis of the ellipsoid (a, b, etc.)
is perpendicular to the plane of symmetry or lies in that plane. In the first case the dispersion of the two optic axes will be unequal. The same crystal, however, with light of different colors, or at different temperatures, may afford an example of each case.

In crystals of the triclinic system, since the ellipsoids (A, B, etc.) and (A', B', etc.) are determined by considerations of a different nature, and there are no relations of symmetry to cause a coincidence in the directions of their axes, there will not in general be any such coincidence. Therefore the three axes of the ellipsoid (a, b, etc.), that is, the two lines which bisect the angles of the optic axes and their common normal, will vary in position with the color of the light.

16. It appears from the foregoing discussion that by the electromagnetic theory of light we may not only account for the dispersion of colors (including the dispersion of the lines which bisect the angles of the optic axes in doubly refracting media), but may also obtain Fresnel's laws of double refraction for every kind of homogeneous light without neglect of the quantities which determine the dispersion of colors. But a closer approximation than that of this paper will be necessary to explain the phenomena of circularly polarizing media, which depend on very minute differences of wave-velocity, represented perhaps by a few units in the sixth significant figure of the index of refraction. That the degree of approximation which will give the laws of circular and elliptic polarization will not add any terms to the equations of this paper, except such as vanish for media which do not exhibit this phenomenon, will be shown in another number of this Journal.

XII.

ON DOUBLE REFRACTION IN PERFECTLY TRANSPARENT MEDIA WHICH EXHIBIT THE PHENOMENA OF CIRCULAR POLARIZATION.

[American Journal of Science, ser. 3, vol. xxiii, pp. 460-476, June, 1882.]

1.
In the April number of this Journal* the velocity of propagation of a system of plane waves of light, regarded as oscillating electrical fluxes, was discussed with such a degree of approximation as would account for the dispersion of colors and give Fresnel's laws of double refraction. It is the object of this paper to supplement that discussion by carrying the approximation so much further as is necessary in order to embrace the phenomena of circularly polarizing media.

* See page 182 of this volume.

2. If we imagine all the velocities in any progressive system of plane waves to be reversed at a given instant without affecting the displacements, and the system of wave-motion thus obtained to be superposed upon the original system, we obtain a system of stationary waves having the same wave-length and period of oscillation as the original progressive system. If we then reduce the magnitude of the displacements in the uniform ratio of two to one, they will be identical, at an instant of maximum displacement, with those of the original system at the same instant.

Following the same method as in the paper cited, let us especially consider the system of stationary waves, and divide the whole displacement into the regular part, represented by $\xi, \eta, \zeta$, and the irregular part, represented by $\xi', \eta', \zeta'$, in accordance with the definitions of § 2 of that paper.

3. The regular part of the displacement is subject to the equations of wave-motion, which may be written (in the most general case of plane stationary waves)

$$\begin{aligned}
\xi &= \left(\alpha_1\cos 2\pi\frac{u}{l} + \alpha_2\sin 2\pi\frac{u}{l}\right)\cos 2\pi\frac{t}{p},\\
\eta &= \left(\beta_1\cos 2\pi\frac{u}{l} + \beta_2\sin 2\pi\frac{u}{l}\right)\cos 2\pi\frac{t}{p},\\
\zeta &= \left(\gamma_1\cos 2\pi\frac{u}{l} + \gamma_2\sin 2\pi\frac{u}{l}\right)\cos 2\pi\frac{t}{p},
\end{aligned} \tag{1}$$

where $l$ denotes the wave-length, $p$ the period of oscillation, $u$ the distance of the point considered from the wave-plane passing through the origin, $\alpha_1, \beta_1, \gamma_1$ the amplitudes of the displacements $\xi, \eta, \zeta$ in the wave-plane passing through the origin, and $\alpha_2, \beta_2, \gamma_2$ their amplitudes in a wave-plane one-quarter of a wave-length distant and on the side toward which $u$ increases.
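The construction of § 2 can be checked numerically for one component of the displacement. The sketch below is only an illustration under stated assumptions: it takes the progressive wave to have the form $\alpha_1\cos(2\pi u/l - 2\pi t/p) + \alpha_2\sin(2\pi u/l - 2\pi t/p)$, which is not written out in the text, and the function names are hypothetical. Reversing every velocity at $t = 0$, superposing, and halving should then reproduce the stationary form of equations (1).

```python
import math


def stationary(u, t, a1, a2, l, p):
    """One component of eq. (1): (a1 cos 2πu/l + a2 sin 2πu/l) cos 2πt/p."""
    k = 2.0 * math.pi / l
    w = 2.0 * math.pi / p
    return (a1 * math.cos(k * u) + a2 * math.sin(k * u)) * math.cos(w * t)


def progressive(u, t, a1, a2, l, p, sense=1):
    """Assumed progressive wave (not given explicitly in the text).

    sense=+1 travels toward increasing u; sense=-1 is the wave obtained by
    reversing every velocity at t = 0 while keeping the displacements.
    """
    k = 2.0 * math.pi / l
    w = 2.0 * math.pi / p
    phase = k * u - sense * w * t
    return a1 * math.cos(phase) + a2 * math.sin(phase)


def superposed_and_halved(u, t, a1, a2, l, p):
    """Superpose the original and reversed waves, then halve the displacements."""
    forward = progressive(u, t, a1, a2, l, p, +1)
    reversed_wave = progressive(u, t, a1, a2, l, p, -1)
    return 0.5 * (forward + reversed_wave)
```

At $t = 0$ the two progressive waves have identical displacements and opposite velocities, so their half-sum agrees with the original system at an instant of maximum displacement, as the text states.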
If we also write L, M, N for the direction-cosines of the wave-normal drawn in the direction in which $u$ increases, we shall have the following necessary relations:

$$L^2 + M^2 + N^2 = 1, \tag{2}$$

$$u = Lx + My + Nz, \tag{3}$$

$$L\alpha_1 + M\beta_1 + N\gamma_1 = 0, \qquad L\alpha_2 + M\beta_2 + N\gamma_2 = 0. \tag{4}$$

4. That the irregular part of the displacement ($\xi', \eta', \zeta'$) at any given point is a simple harmonic function of the time, having the same period and phase as the regular part of the displacement ($\xi, \eta, \zeta$), may be proved by the single principle of superposition of motions, and is therefore to be regarded as exact in a discussion of this kind. But the further conclusion of the preceding paper (§ 4), "that the values of $\xi', \eta', \zeta'$ at any given point in the medium are capable of expression as linear functions of $\xi, \eta, \zeta$ in a manner which shall be independent of the time and of the orientation of the wave-planes and the distance of a nodal plane from the point considered, so long as the period of oscillation remains the same," is evidently only approximative, although a very close approximation. A very much closer approximation may be obtained, if we regard $\xi', \eta', \zeta'$, at any given point of the medium and for light of a given period, as linear functions of $\xi, \eta, \zeta$ and the nine differential coefficients

$$\frac{d\xi}{dx},\ \frac{d\eta}{dx},\ \frac{d\zeta}{dx},\ \frac{d\xi}{dy},\ \text{etc.}$$

We shall write $\xi, \eta, \zeta$ and diff. coeff. to denote these twelve quantities.

From this it follows immediately that with the same degree of approximation $\dot\xi', \dot\eta', \dot\zeta'$ may be regarded, for a given point of the medium and light of a given period, as linear functions of $\dot\xi, \dot\eta, \dot\zeta$ and the differential coefficients of $\dot\xi, \dot\eta, \dot\zeta$ with respect to the coordinates. For these twelve quantities we shall write $\dot\xi, \dot\eta, \dot\zeta$ and diff. coeff.

5. Let us now proceed to equate the statical energy of the medium at an instant of no velocity with its kinetic energy at an instant of no displacement. It will be convenient to estimate each of these quantities for a unit of volume.

6.
The statical energy of an infinitesimal element of volume may be represented by … $\xi\dfrac{d\eta}{dx} - \eta\dfrac{d\xi}{dx}$, etc. (the unwritten expressions being obtained by substituting in the denominators $dy$ and $dz$ for $dx$), which constitutes the part of $S''$ that we have to consider. $S''$ is therefore a linear function of the space-averages of these nine quantities. But by (3)

$$\xi\frac{d\eta}{dx} - \eta\frac{d\xi}{dx} = L\left(\xi\frac{d\eta}{du} - \eta\frac{d\xi}{du}\right),$$

and the space-average of this, at a moment of maximum displacement, is

$$\frac{2\pi L}{l}(\alpha_1\beta_2 - \beta_1\alpha_2).$$

By such reductions it appears that $lS''$ is a linear function of the nine products of L, M, N with

$$\beta_1\gamma_2 - \gamma_1\beta_2, \qquad \gamma_1\alpha_2 - \alpha_1\gamma_2, \qquad \alpha_1\beta_2 - \beta_1\alpha_2.$$

Now if we set

$$\Theta = L(\beta_1\gamma_2 - \gamma_1\beta_2) + M(\gamma_1\alpha_2 - \alpha_1\gamma_2) + N(\alpha_1\beta_2 - \beta_1\alpha_2), \tag{7}$$

we have by (4) and (2)

$$L\Theta = \beta_1\gamma_2 - \gamma_1\beta_2, \qquad M\Theta = \gamma_1\alpha_2 - \alpha_1\gamma_2, \qquad N\Theta = \alpha_1\beta_2 - \beta_1\alpha_2. \tag{8}$$

Therefore $lS''$ is a linear function of the nine products of L, M, N with $L\Theta$, $M\Theta$, $N\Theta$. That is, $lS''$ is the product of $\Theta$ and a quadratic function of L, M and N. We may therefore write

$$S'' = \frac{\Phi}{l}\Theta = \frac{\Phi}{l}\left[L(\beta_1\gamma_2 - \gamma_1\beta_2) + M(\gamma_1\alpha_2 - \alpha_1\gamma_2) + N(\alpha_1\beta_2 - \beta_1\alpha_2)\right], \tag{9}$$

where $\Phi$ is a quadratic function of L, M and N, dependent, however, on the nature of the medium and the period of oscillation.

9. It will be useful to consider more closely the geometrical significance of the quantity $\Theta$. For this purpose it will be convenient to have a definite understanding with respect to the relative position of the coordinate axes.

We shall suppose that the axes of X, Y, and Z are related in the same way as lines drawn to the right, forward and upward, so that a rotation from X to Y appears clockwise to one looking in the direction of Z.

Now if from any same point, as the origin of coordinates, we lay off lines representing in direction and magnitude the displacements in all the different wave-planes, we obtain an ellipse, which we may call the displacement-ellipse.* Of this, one radius vector ($\rho_1$) will have the components $\alpha_1, \beta_1, \gamma_1$,
and another ($\rho_2$) the components $\alpha_2, \beta_2, \gamma_2$. These will belong to conjugate diameters, each being parallel to the tangent at the extremity of the other. The area of the ellipse will therefore be equal to the parallelogram of which $\rho_1$ and $\rho_2$ are two sides, multiplied by $\pi$. Now it is evident that

$$\beta_1\gamma_2 - \gamma_1\beta_2, \qquad \gamma_1\alpha_2 - \alpha_1\gamma_2, \qquad \alpha_1\beta_2 - \beta_1\alpha_2$$

are numerically equal to the projections of this parallelogram on the planes of the coordinate axes, and are each positive or negative according as a revolution from $\rho_1$ to $\rho_2$ appears clockwise or counter-clockwise to one looking in the direction of the proper coordinate axis. Hence, $\Theta$ will be numerically equal to the parallelogram, that is, to the area of the displacement-ellipse divided by $\pi$, and will be positive or negative according as a revolution from $\rho_1$ to $\rho_2$ appears clockwise or counter-clockwise to one looking in the direction of the wave-normal. Since $\rho_1$ and $\rho_2$ are determined by displacements in planes one-quarter of a wave-length distant from each other, and the plane to which the latter relates lies on the side toward which the wave-normal is drawn, it follows that $\Theta$ is positive or negative according as the combination of displacements has the character of a right-handed or a left-handed screw.

* This ellipse, which represents the simultaneous displacements in different parts of the field, will also represent the successive displacements at any same point in the corresponding system of progressive waves.

10. The kinetic energy of the medium, which is to be estimated for an instant of no displacement, may be shown as in § 7 of the former paper (page 185 of this volume) to consist of two parts, of which one relates to the regular flux ($\dot\xi, \dot\eta, \dot\zeta$), and the other to the irregular flux ($\dot\xi', \dot\eta', \dot\zeta'$). The first, in the notation of that paper, is represented by

$$\tfrac{1}{2}\int\left(\dot\xi\,\mathrm{Pot}\,\dot\xi + \dot\eta\,\mathrm{Pot}\,\dot\eta + \dot\zeta\,\mathrm{Pot}\,\dot\zeta\right)dv,$$

which reduces to

$$\frac{l^2}{2\pi}\int\left(\dot\xi^2 + \dot\eta^2 + \dot\zeta^2\right)dv.$$
By substitution of the values given by equations (1), we obtain for the kinetic energy due to the regular flux in a unit of volume

$$T = \frac{\pi l^2}{p^2}\left(\alpha_1^2 + \beta_1^2 + \gamma_1^2 + \alpha_2^2 + \beta_2^2 + \gamma_2^2\right). \tag{10}$$

11. The kinetic energy of the irregular part of the flux is represented by the volume-integral

$$\tfrac{1}{2}\int\left(\dot\xi'\,\mathrm{Pot}\,\dot\xi' + \dot\eta'\,\mathrm{Pot}\,\dot\eta' + \dot\zeta'\,\mathrm{Pot}\,\dot\zeta'\right)dv.$$

Now, since $\dot\xi', \dot\eta', \dot\zeta'$ are everywhere linear functions of $\dot\xi, \dot\eta, \dot\zeta$ and diff. coeff. (see § 4), and since the integrations implied in the notation Pot may be confined to a sphere of which the radius is small in comparison with a wave-length,* and since within such a sphere $\dot\xi, \dot\eta, \dot\zeta$ and diff. coeff. are sufficiently determined (in a linear form) by the values of the same twelve quantities at the center of the sphere, it follows that $\mathrm{Pot}\,\dot\xi'$, $\mathrm{Pot}\,\dot\eta'$, $\mathrm{Pot}\,\dot\zeta'$ must be linear functions of the values of $\dot\xi, \dot\eta, \dot\zeta$ and diff. coeff. at the point for which the potential is sought. Hence,

$$\tfrac{1}{2}\left(\dot\xi'\,\mathrm{Pot}\,\dot\xi' + \dot\eta'\,\mathrm{Pot}\,\dot\eta' + \dot\zeta'\,\mathrm{Pot}\,\dot\zeta'\right)$$

will be a quadratic function of $\dot\xi, \dot\eta, \dot\zeta$ and diff. coeff. But the seventy-eight coefficients by which this function is expressed will vary with the position of the point considered with respect to the surrounding molecules.

* See § 9 of the former paper, on page 187 of this volume.

Yet, as in the case of the statical energy, we may substitute the average values of these coefficients for the coefficients themselves in the integral by which we obtain the energy of any considerable space. The kinetic energy due to the irregular part of the flux is thus reduced to a quadratic function of $\dot\xi, \dot\eta, \dot\zeta$ and diff. coeff. which has constant coefficients for a given medium and light of a given period. The function may be divided into three parts, of which the first contains the squares and products of $\dot\xi, \dot\eta, \dot\zeta$, the second the products of $\dot\xi, \dot\eta, \dot\zeta$ with their differential coefficients, and the third, which may be neglected, the squares and products of the differential coefficients.
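We may proceed with the reduction precisely as in the case of the statical energy, except that the differentiations with respect to the time will introduce the constant factor $4\pi^2/p^2$. This will give for the first part of the kinetic energy of the irregular flux per unit of volume

$$T' = \frac{2\pi^2}{p^2}\left(A'\alpha_1^2 + B'\beta_1^2 + C'\gamma_1^2 + E'\beta_1\gamma_1 + F'\gamma_1\alpha_1 + G'\alpha_1\beta_1\right) + \frac{2\pi^2}{p^2}\left(A'\alpha_2^2 + B'\beta_2^2 + C'\gamma_2^2 + E'\beta_2\gamma_2 + F'\gamma_2\alpha_2 + G'\alpha_2\beta_2\right), \tag{11}$$

and for the second part of the same

$$T'' = \frac{4\pi^2\Phi'}{p^2 l}\Theta = \frac{4\pi^2\Phi'}{p^2 l}\left[L(\beta_1\gamma_2 - \gamma_1\beta_2) + M(\gamma_1\alpha_2 - \alpha_1\gamma_2) + N(\alpha_1\beta_2 - \beta_1\alpha_2)\right], \tag{12}$$

where A', B', C', E', F', G' are constant, and $\Phi'$ a quadratic function of L, M, and N, for a given medium and light of a given period.

12. Equating the statical and kinetic energies, we have

$$S' + S'' = T + T' + T'',$$

that is, by equations (6), (9), (10), (11), and (12),

$$\begin{aligned}
&\tfrac{1}{2}\left(A\alpha_1^2 + B\beta_1^2 + C\gamma_1^2 + E\beta_1\gamma_1 + F\gamma_1\alpha_1 + G\alpha_1\beta_1\right)\\
&\quad + \tfrac{1}{2}\left(A\alpha_2^2 + B\beta_2^2 + C\gamma_2^2 + E\beta_2\gamma_2 + F\gamma_2\alpha_2 + G\alpha_2\beta_2\right)\\
&\quad + \frac{\Phi}{l}\left[L(\beta_1\gamma_2 - \gamma_1\beta_2) + M(\gamma_1\alpha_2 - \alpha_1\gamma_2) + N(\alpha_1\beta_2 - \beta_1\alpha_2)\right]\\
&= \frac{\pi l^2}{p^2}\left(\alpha_1^2 + \beta_1^2 + \gamma_1^2 + \alpha_2^2 + \beta_2^2 + \gamma_2^2\right)\\
&\quad + \frac{2\pi^2}{p^2}\left(A'\alpha_1^2 + B'\beta_1^2 + C'\gamma_1^2 + E'\beta_1\gamma_1 + F'\gamma_1\alpha_1 + G'\alpha_1\beta_1\right)\\
&\quad + \frac{2\pi^2}{p^2}\left(A'\alpha_2^2 + B'\beta_2^2 + C'\gamma_2^2 + E'\beta_2\gamma_2 + F'\gamma_2\alpha_2 + G'\alpha_2\beta_2\right)\\
&\quad + \frac{4\pi^2\Phi'}{p^2 l}\left[L(\beta_1\gamma_2 - \gamma_1\beta_2) + M(\gamma_1\alpha_2 - \alpha_1\gamma_2) + N(\alpha_1\beta_2 - \beta_1\alpha_2)\right].
\end{aligned} \tag{13}$$

If we set

$$a = \frac{A}{2\pi} - \frac{2\pi A'}{p^2}, \qquad b = \frac{B}{2\pi} - \frac{2\pi B'}{p^2}, \qquad \text{etc.,} \tag{14}$$

and

$$\phi = \frac{\Phi}{2\pi} - \frac{2\pi\Phi'}{p^2}, \tag{15}$$

the equation reduces to

$$\begin{aligned}
&a\alpha_1^2 + b\beta_1^2 + c\gamma_1^2 + e\beta_1\gamma_1 + f\gamma_1\alpha_1 + g\alpha_1\beta_1\\
&\quad + a\alpha_2^2 + b\beta_2^2 + c\gamma_2^2 + e\beta_2\gamma_2 + f\gamma_2\alpha_2 + g\alpha_2\beta_2\\
&\quad + \frac{2\phi}{l}\left[L(\beta_1\gamma_2 - \gamma_1\beta_2) + M(\gamma_1\alpha_2 - \alpha_1\gamma_2) + N(\alpha_1\beta_2 - \beta_1\alpha_2)\right]\\
&= \frac{l^2}{p^2}\left(\alpha_1^2 + \beta_1^2 + \gamma_1^2 + \alpha_2^2 + \beta_2^2 + \gamma_2^2\right),
\end{aligned} \tag{16}$$

where a, b, c, e, f, g are constant, and $\phi$ a quadratic function of L, M, N, for a given medium and light of a given period.

13. Now this equation, which expresses a relation between the constants of the equations of wave-motion (1), will apply, with those equations, not only to such vibrations as actually take place, but also to such as we may imagine to take place under the influence of constraints determining the type of vibration.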
The free or unconstrained vibrations, with which alone we are concerned, are characterized by this, that infinitesimal variations (by constraint) of the type of vibration, that is, of the ratios of the quantities $\alpha_1, \beta_1, \gamma_1, \alpha_2, \beta_2, \gamma_2$, will not affect the period by any quantity of the same order of magnitude.* These variations must however be consistent with equations (4), which require that

$$L\,d\alpha_1 + M\,d\beta_1 + N\,d\gamma_1 = 0, \qquad L\,d\alpha_2 + M\,d\beta_2 + N\,d\gamma_2 = 0. \tag{17}$$

* Compare § 11 of the former paper, page 189 of this volume.

Hence, to obtain the conditions which characterize free vibration, we may differentiate equation (16) with respect to $\alpha_1, \beta_1, \gamma_1, \alpha_2, \beta_2, \gamma_2$, regarding all other letters as constant, and give to $d\alpha_1, d\beta_1, d\gamma_1, d\alpha_2, d\beta_2, d\gamma_2$ such values as are consistent with equations (17). Now $d\alpha_1, d\beta_1, d\gamma_1$ are independent of $d\alpha_2, d\beta_2, d\gamma_2$, and for either three variations, values proportional either to $\alpha_1, \beta_1, \gamma_1$, or to $\alpha_2, \beta_2, \gamma_2$, are possible. If, then, we differentiate equation (16) with respect to $\alpha_1, \beta_1, \gamma_1$, and substitute first $\alpha_1, \beta_1, \gamma_1$, and then $\alpha_2, \beta_2, \gamma_2$, for $d\alpha_1, d\beta_1, d\gamma_1$, and also differentiate with respect to $\alpha_2, \beta_2, \gamma_2$, with similar substitutions, we shall obtain all the independent equations which this principle will yield.

If we differentiate with respect to $\alpha_1, \beta_1, \gamma_1$, and write $\alpha_1, \beta_1, \gamma_1$ for $d\alpha_1, d\beta_1, d\gamma_1$, we obtain

$$a\alpha_1^2 + b\beta_1^2 + c\gamma_1^2 + e\beta_1\gamma_1 + f\gamma_1\alpha_1 + g\alpha_1\beta_1 + \frac{\phi}{l}\left[L(\beta_1\gamma_2 - \gamma_1\beta_2) + M(\gamma_1\alpha_2 - \alpha_1\gamma_2) + N(\alpha_1\beta_2 - \beta_1\alpha_2)\right] = \frac{l^2}{p^2}\left(\alpha_1^2 + \beta_1^2 + \gamma_1^2\right). \tag{18}$$

If we differentiate with respect to $\alpha_1, \beta_1, \gamma_1$, and write $\alpha_2, \beta_2, \gamma_2$ for $d\alpha_1, d\beta_1, d\gamma_1$, we obtain

$$2a\alpha_1\alpha_2 + 2b\beta_1\beta_2 + 2c\gamma_1\gamma_2 + e(\beta_1\gamma_2 + \gamma_1\beta_2) + f(\gamma_1\alpha_2 + \alpha_1\gamma_2) + g(\alpha_1\beta_2 + \beta_1\alpha_2) = \frac{2l^2}{p^2}\left(\alpha_1\alpha_2 + \beta_1\beta_2 + \gamma_1\gamma_2\right). \tag{19}$$

If we differentiate with respect to $\alpha_2, \beta_2, \gamma_2$, and write $\alpha_2, \beta_2, \gamma_2$ for $d\alpha_2, d\beta_2, d\gamma_2$, we obtain

$$a\alpha_2^2 + b\beta_2^2 + c\gamma_2^2 + e\beta_2\gamma_2 + f\gamma_2\alpha_2 + g\alpha_2\beta_2 + \frac{\phi}{l}\left[L(\beta_1\gamma_2 - \gamma_1\beta_2) + M(\gamma_1\alpha_2 - \alpha_1\gamma_2) + N(\alpha_1\beta_2 - \beta_1\alpha_2)\right] = \frac{l^2}{p^2}\left(\alpha_2^2 + \beta_2^2 + \gamma_2^2\right). \tag{20}$$

The equation derived by differentiating with respect to $\alpha_2, \beta_2, \gamma_2$, and writing $\alpha_1, \beta_1, \gamma_1$ for $d\alpha_2, d\beta_2, d\gamma_2$, is identical with (19).

We should also observe that equations (18) and (20) by addition give equation (16), which therefore will not need to be considered in addition to the last three equations.

14. The geometrical signification of our equations may now be simplified by a suitable choice of the position of the origin of coordinates, which is as yet wholly arbitrary. We shall hereafter suppose that the origin is placed in a plane of maximum or minimum displacement,* if such there are. In the case of circular polarization, in which the displacements are everywhere equal, its position is immaterial. The lines $\rho_1$ and $\rho_2$, of which $\alpha_1, \beta_1, \gamma_1$ and $\alpha_2, \beta_2, \gamma_2$ are respectively the components, will now be the semi-axes of the displacement-ellipse, and therefore at right angles. (See § 9.) The case of circular polarization will not constitute any exception. Hence,

$$\alpha_1\alpha_2 + \beta_1\beta_2 + \gamma_1\gamma_2 = 0, \tag{21}$$

and by § 9,

$$\Theta = L(\beta_1\gamma_2 - \gamma_1\beta_2) + M(\gamma_1\alpha_2 - \alpha_1\gamma_2) + N(\alpha_1\beta_2 - \beta_1\alpha_2) = \pm\rho_1\rho_2, \tag{22}$$

where we are to read + or − in the last member according as the system of displacements has the character of a right-handed or a left-handed screw.

* The reader will perceive that an earlier limitation of the position of the origin by a supposition of this nature, involving a limitation of the values of $\alpha_1, \beta_1, \gamma_1, \alpha_2, \beta_2, \gamma_2$, would have been embarrassing in the operations of the last paragraph.

15. Equation (19) is now reduced to the form

$$2a\alpha_1\alpha_2 + 2b\beta_1\beta_2 + 2c\gamma_1\gamma_2 + e(\beta_1\gamma_2 + \gamma_1\beta_2) + f(\gamma_1\alpha_2 + \alpha_1\gamma_2) + g(\alpha_1\beta_2 + \beta_1\alpha_2) = 0, \tag{23}$$

which has a very simple geometrical signification.
If we consider the ellipsoid

$$ax^2 + by^2 + cz^2 + eyz + fzx + gxy = 1, \tag{24}$$

and especially its central section by a plane parallel to the planes of the wave-system which we are considering, it will easily appear that the equation

$$2ax_1x_2 + 2by_1y_2 + 2cz_1z_2 + e(y_1z_2 + z_1y_2) + f(z_1x_2 + x_1z_2) + g(x_1y_2 + y_1x_2) = 0$$

will hold of any two points $x_1, y_1, z_1$ and $x_2, y_2, z_2$ which belong to conjugate diameters of this central section. Therefore equation (23) expresses that the displacements $\alpha_1, \beta_1, \gamma_1$ and $\alpha_2, \beta_2, \gamma_2$ are parallel to conjugate diameters of the central section of the ellipsoid (24) by a wave-plane. But since the displacements $\alpha_1, \beta_1, \gamma_1$ and $\alpha_2, \beta_2, \gamma_2$ are also at right angles to each other, it follows that they are parallel to the axes of the central section of the ellipsoid (24) by a wave-plane. That is: The axes of the displacement-ellipse coincide in direction with those of a central section of the ellipsoid (24) by a wave-plane.

16. If we write $U_1$, $U_2$ for the reciprocals of the semi-axes of the central section of the ellipsoid (24) by a wave-plane, $U_1$ being the reciprocal of the one to which the displacement $\alpha_1, \beta_1, \gamma_1$ is parallel, we have

$$a\alpha_1^2 + b\beta_1^2 + c\gamma_1^2 + e\beta_1\gamma_1 + f\gamma_1\alpha_1 + g\alpha_1\beta_1 = U_1^2\left(\alpha_1^2 + \beta_1^2 + \gamma_1^2\right), \tag{25}$$

as is at once evident if we substitute the coordinates of an extremity of the axis for the proportional quantities $\alpha_1, \beta_1, \gamma_1$. So also

$$a\alpha_2^2 + b\beta_2^2 + c\gamma_2^2 + e\beta_2\gamma_2 + f\gamma_2\alpha_2 + g\alpha_2\beta_2 = U_2^2\left(\alpha_2^2 + \beta_2^2 + \gamma_2^2\right). \tag{26}$$

If we write V for the velocity of propagation of the system of progressive waves corresponding to the system of stationary waves which we have been considering, we shall have

$$V = \frac{l}{p}. \tag{27}$$

By equations (22), (25), and (26), equations (18) and (20) are reduced to the form

$$U_1^2\rho_1^2 \pm \frac{\phi}{l}\rho_1\rho_2 = V^2\rho_1^2, \qquad U_2^2\rho_2^2 \pm \frac{\phi}{l}\rho_1\rho_2 = V^2\rho_2^2, \tag{28}$$

where we are to read + or − according as the disturbance has the character of a right-handed or a left-handed screw.
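The quantities $U_1^2$ and $U_2^2$ of equations (25) and (26) can be computed for any wave-normal as the two extreme values of the quadratic form of the ellipsoid (24) over unit vectors lying in the wave-plane. The following is a minimal sketch of that computation, not a procedure given in the text; the function names, the tuple order `(a, b, c, e, f, g)` and the construction of the in-plane basis are assumptions made for illustration.

```python
import math


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def _unit(v):
    n = math.sqrt(_dot(v, v))
    return (v[0] / n, v[1] / n, v[2] / n)


def _form(coef, v, w):
    """Symmetric bilinear form belonging to a x^2 + b y^2 + c z^2 + e yz + f zx + g xy."""
    a, b, c, e, f, g = coef
    return (a * v[0] * w[0] + b * v[1] * w[1] + c * v[2] * w[2]
            + 0.5 * e * (v[1] * w[2] + v[2] * w[1])
            + 0.5 * f * (v[2] * w[0] + v[0] * w[2])
            + 0.5 * g * (v[0] * w[1] + v[1] * w[0]))


def section_u_squared(coef, normal):
    """Return the two values of U^2 (larger first) for the central section
    of the ellipsoid (24) by the plane perpendicular to `normal`."""
    n = _unit(normal)
    # Pick the coordinate axis least aligned with the normal to build a basis.
    helper = min(((1, 0, 0), (0, 1, 0), (0, 0, 1)), key=lambda ax: abs(_dot(ax, n)))
    s = _unit(_cross(n, helper))   # first unit vector in the wave-plane
    t = _cross(n, s)               # second unit vector, perpendicular to s and n
    qss, qtt, qst = _form(coef, s, s), _form(coef, t, t), _form(coef, s, t)
    mean = 0.5 * (qss + qtt)
    half_gap = math.sqrt((0.5 * (qss - qtt)) ** 2 + qst ** 2)
    return mean + half_gap, mean - half_gap
```

For example, with `coef = (4, 2, 1, 0, 0, 0)` a wave-normal along Z gives the values 4 and 2, while a wave-normal in the X-Z plane at the angle whose squared tangent is $(a-b)/(b-c)$ from Z gives two equal values, which is the condition $U_1 = U_2$ for a circular section.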
In a progressive system of waves, when the combination of displacements has the character of a right-handed screw, the rotations will be such as appear clockwise to the observer, who looks in the direction opposite to that of the propagation of light. We shall call such a ray right-handed.

We may here observe that in case $\phi = 0$ the solution of these equations is very simple. We have necessarily either $\rho_2 = 0$ and $V^2 = U_1^2$, or $\rho_1 = 0$ and $V^2 = U_2^2$. In this case, the light is linearly polarized, and the directions of oscillation and the velocities of propagation are given by Fresnel's law. Experiment has shown that this is the usual case. We wish, however, to investigate the case in which
$\phi$ does not vanish, although $\phi/p$ is always very small in comparison with $V^3$, $U_1^3$, or $U_2^3$.

17. Equations (28) may be written

$$V^2 - U_1^2 = \pm\frac{\phi}{pV}\frac{\rho_2}{\rho_1}, \qquad V^2 - U_2^2 = \pm\frac{\phi}{pV}\frac{\rho_1}{\rho_2}. \tag{29}$$

By multiplication we obtain

$$V^2\left(V^2 - U_1^2\right)\left(V^2 - U_2^2\right) = \frac{\phi^2}{p^2}. \tag{30}$$

Since $\phi$ is a very small quantity, it is evident from inspection of this equation that it will admit three values of $V^2$, of which one will be a very little greater than the greater of the two quantities $U_1^2$ and $U_2^2$, another will be a very little less than the less of the same two quantities, and the third will be a very small quantity. It is evident that the values of $V^2$ with which we have to do are those which differ but little from $U_1^2$ and $U_2^2$.*

For the numerical computation of V, when $U_1$, $U_2$, and $\phi$ are known numerically, we may divide the equation by $V^2$, and then solve it as if the second member were known. This will give

$$V^2 = \frac{U_1^2 + U_2^2}{2} \pm \sqrt{\frac{\left(U_1^2 - U_2^2\right)^2}{4} + \frac{\phi^2}{p^2V^2}}. \tag{31}$$

By substituting $U_1U_2$ for $V^2$ in the second member, we may obtain a close approximation to the two values of $V^2$. Each of the values obtained may be improved by substitution of that value for $V^2$ in the second member of the equation.

For either value of $V^2$, we may easily find the ratio of $\rho_1$ to $\rho_2$, that is, the ratio of the axes of the displacement-ellipse, from one of equations (29), or from the equation

$$\frac{V^2 - U_1^2}{V^2 - U_2^2} = \frac{\rho_2^2}{\rho_1^2}, \tag{32}$$

obtained by combining the two.

* We should not attribute any physical significance to the third value of $V^2$. For this value would imply a wave-length very small in comparison with the length of ordinary waves of light, and with respect to which our fundamental assumption that the wave-length is very great in comparison with the distances of contiguous molecules would be entirely false. Our analysis, therefore, furnishes no reason for supposing that any such velocities are possible for the propagation of electrical disturbances.

In equations (29), we are to read + or − in the second members, according as the ray is right-handed or left-handed. (See § 16.)
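The method of successive substitution described in § 17 can be sketched as follows. This is an illustrative reading of the procedure rather than a definitive implementation: it assumes that the relation obtained on dividing (30) by $V^2$ is solved as a quadratic in $V^2$ with the second member held fixed, and the function name, argument names and iteration count are hypothetical.

```python
import math


def wave_velocity_squares(u1, u2, phi, p, iterations=20):
    """Return (greater, lesser) values of V^2 that lie close to U1^2 and U2^2
    and satisfy V^2 (V^2 - U1^2)(V^2 - U2^2) = phi^2 / p^2."""
    mean = 0.5 * (u1 ** 2 + u2 ** 2)
    quarter_gap_sq = 0.25 * (u1 ** 2 - u2 ** 2) ** 2
    c = (phi / p) ** 2
    roots = []
    for sign in (+1.0, -1.0):
        v_sq = u1 * u2  # first substitution for V^2 in the second member
        for _ in range(iterations):
            # Solve the quadratic as if the second member c / V^2 were known,
            # then feed the improved value back in.
            v_sq = mean + sign * math.sqrt(quarter_gap_sq + c / v_sq)
        roots.append(v_sq)
    return roots[0], roots[1]
```

Because $\phi/p$ is small beside $V^3$, each substitution changes the second member very little, so a few passes suffice in practice; the fixed count of twenty is simply a generous default.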
It follows that if the value of $\phi$ is positive, the greater velocity will belong to a right-handed ray, and the smaller to a left-handed, but if the value of $\phi$ is negative, the opposite is the case. Except when $\phi = 0$, and the polarization is linear, there will be one right-handed and one left-handed ray for any given wave-normal and period.

18. When $U_1 = U_2$, equations (29) give

$$\rho_1 = \rho_2, \qquad V^2 = U^2 \pm \frac{\phi}{pV},$$

where U represents the common value of $U_1$ and $U_2$. The polarization is therefore circular. The converse is also evident from equations (29), viz., that a ray can be circularly polarized only when the direction of its wave-normal is such that $U_1 = U_2$. Such a direction, which is determined by a circular section of the ellipsoid (24) precisely as an optic axis of a crystal which conforms to Fresnel's law of double refraction, may be called an optic axis, although its physical properties are not the same as in the more ordinary case.*

* Our experimental knowledge of circularly or elliptically polarizing media is confined to such as are optically either isotropic or uniaxial. The general theory of such media, embracing the case of two optic axes, has however been discussed by Professor von Lang ("Theorie der Circularpolarization," Sitz.-Ber. Wiener Akad., vol. lxxv, p. 719). The general results of the present paper, although derived from physical hypotheses of an entirely different nature, are quite similar to those of the memoir cited. They would become identical, the writer believes, by the substitution of a constant for $\phi/l$ or $\phi/(pV)$ in the equations of this paper. (See especially equations (18), (20), (28).) That a complete discussion of the subject on any theory must include the case of biaxial media having the property of circular or elliptical polarization, is evident from the consideration that it must at least be possible to produce examples of such media artificially. An isotropic or uniaxial crystal may be made biaxial by pressure. If it has the property of circular and elliptic polarization, that property cannot be wholly destroyed by the application of small pressures.

If we write $V_R$ and $V_L$, respectively, for the wave-velocities of the right-handed and left-handed rays, we have

$$V_R^2 = U^2 + \frac{\phi}{pV_R}, \qquad V_L^2 = U^2 - \frac{\phi}{pV_L}; \tag{33}$$

whence

$$V_R^2 - V_L^2 = \frac{\phi}{p}\left(\frac{1}{V_R} + \frac{1}{V_L}\right) = \frac{\phi}{p}\,\frac{V_R + V_L}{V_RV_L},$$

and

$$V_R - V_L = \frac{\phi}{pV_RV_L}. \tag{34}$$

The phenomenon best observed with respect to an optic axis is the rotation of the plane of linearly polarized light. If we denote by $\theta$ the amount of this rotation per unit of the distance traversed by the wave-plane, regarding it as positive when it appears clockwise to the observer, who looks in the direction opposite to that of the propagation of the light,* we have

$$\theta = \frac{\pi}{pV_L} - \frac{\pi}{pV_R} = \frac{\pi\left(V_R - V_L\right)}{pV_RV_L}, \tag{35}$$

$$\theta = \frac{\pi\phi}{p^2V_R^2V_L^2}, \tag{36}$$

or, very nearly,†

$$\theta = \frac{\pi\phi}{p^2U^4}. \tag{37}$$

19. Since these equations involve unknown functions of the period they will not serve for an exact determination of the relation between $\theta$ and the period. For a rough approximation, however, we may assume that the manner in which the general displacement in any small part of the medium distributes itself among the molecules and intermolecular spaces is independent of the period, being determined entirely by the values of $\xi, \eta, \zeta$ and their differential coefficients with respect to the coordinates.‡ For a fixed direction of the wave-normal, $\Phi$ and $\Phi'$ will then be constant. Now equations (15) and (36) give

$$\theta = \frac{\Phi}{2p^2V_R^2V_L^2} - \frac{2\pi^2\Phi'}{p^4V_R^2V_L^2}. \tag{38}$$

To express this result in terms of the quantities directly observed, we may use the equations

$$p = \frac{\lambda}{k}, \qquad V_R = \frac{k}{n_R}, \qquad V_L = \frac{k}{n_L}, \qquad U = \frac{k}{n},$$

where $k$ denotes the velocity of light in vacuo, $\lambda$ the wave-length in vacuo of the light employed, $n_R$, $n_L$ the absolute indices of refraction of the two rays, and $n$ the index for the optic axis as derived from the ellipsoid (24) by Fresnel's law. We thus obtain

$$\theta = \frac{\Phi\,n_R^2n_L^2}{2k^2\lambda^2} - \frac{2\pi^2\Phi'\,n_R^2n_L^2}{\lambda^4}. \tag{39}$$

* When the rotation of the plane of polarization appears clockwise to the observer, it has the character of a left-handed screw.
But the circularly polarized ray to which $V_R$ relates, the rotation of which also appears clockwise to the observer, has the character of a right-handed screw.

† The degree of accuracy of this substitution may be shown as follows. By (33)

$$V_R\left(V_R^2 - U^2\right) = V_L\left(U^2 - V_L^2\right),$$

whence

$$V_R^3 + V_L^3 = \left(V_R + V_L\right)U^2, \qquad V_R^2 - V_RV_L + V_L^2 = U^2, \qquad V_RV_L = U^2 - \left(V_R - V_L\right)^2.$$

‡ Compare § 12 of the former paper, on page 189 of this volume.

In the case of uniaxial crystals, the direction of the optic axis is fixed. We may therefore write

$$\theta = n_R^2n_L^2\left(\frac{K}{\lambda^2} + \frac{K'}{\lambda^4}\right), \tag{40}$$

regarding K and K' as constants. If we had used equation (37), we should have had the factor $n^4$ instead of $n_R^2n_L^2$. Since this factor varies but slowly with $\lambda$, it may be neglected, if its omission is compensated in the values of K and K'. The formula being only approximative, such a simplification will not necessarily render it less accurate.

20. But without any such assumption as that contained in the last paragraph, we may easily obtain formulae for the experimental determination of $\Phi$ and $\Phi'$ for the optic axis of a uniaxial crystal. Considerations analogous to those of § 13 of the former paper (page 190 of this volume), show that in differentiating equation (39) we may regard $\Phi$ and $\Phi'$ as constant, although they may actually vary with $\lambda$. This equation may be written

$$\frac{\theta\lambda^2}{n^4} = \frac{\Phi}{2k^2} - \frac{2\pi^2\Phi'}{\lambda^2}. \tag{41}$$

Therefore,

$$\frac{d}{d\lambda}\left(\frac{\theta\lambda^2}{n^4}\right) = \frac{4\pi^2\Phi'}{\lambda^3}. \tag{42}$$

When $\Phi'$ has been determined by this equation, $\Phi$ may be found from the preceding.

21. If we wish to represent $\phi$ geometrically, like $U_1$ and $U_2$, we may construct the surfaces

$$Ax^2 + By^2 + Cz^2 + Eyz + Fzx + Gxy = \pm 1, \tag{43}$$

the coefficients A, B, etc., being the same by which
$\phi$ is expressed as a quadratic function of L, M, N. The numerical value of $\phi$, for any direction of the wave-normal, will thus be represented by the square of the reciprocal of the radius vector of the surface drawn in the same direction. The positive or negative character of $\phi$ must be separately indicated. There are here two cases to be distinguished. If the sign of $\phi$ is the same in all directions, the surface will be an ellipsoid, and we have only to know whether all the values of
$\phi$ are positive or all negative. If $\phi$ is positive for some directions and negative for others, the surface will consist of two conjugate hyperboloids, to one of which the positive, and to the other the negative values belong.

22. The manner in which the ellipsoid (24) may be partially determined by the relations of symmetry which the medium may possess, has been sufficiently discussed in the former paper. With respect to the quantity $\phi$, and the surfaces which determine it, the following principle is of fundamental importance. If one body is identical in its internal structure with the image by reflection of another, the values of
$\phi$ in corresponding directions in the two bodies must be numerically equal, but of opposite signs.* Hence, if a body is identical in its internal structure with its own image by reflection, the values of $\phi$ (if not zero for all directions) must be positive for some directions and negative for others. Moreover, the above described surface by which $\phi$ is represented must consist of two conjugate hyperboloids, of which one is identical in form with the image by reflection of the other. This requires that the hyperboloids shall be right cylinders with conjugate rectangular hyperbolas for bases. A crystal characterized by such properties will belong to the tetragonal system. Since $\phi = 0$ for the optic axis, it would be difficult to distinguish a case of this kind from an ordinary uniaxial crystal, unless the ellipsoid (24) should approach very closely to a sphere.†

* The necessity of the opposite signs will perhaps appear most readily from the consideration that the direction of rotation of the plane of polarization must be opposite in the two bodies.

† There is no difficulty in conceiving of the constitution of a body which would have the properties described above. Thus, we may imagine a body with molecules of a spiral form, of which one-half are right-handed and one-half left-handed, and we may suppose that the motion of electricity is opposed by a less resistance within them than without. If the axes of the right-handed molecules are parallel to the axis of X, and those of the left-handed molecules to the axis of Y, their effects would counterbalance one another when the wave-normal is parallel to the axis of Z. But when the wave-normal (of a beam of linearly polarized light) is parallel to the axis of X, the left-handed molecules would produce a left-handed (negative) rotation of the plane of polarization, the right-handed molecules having no effect; and when the wave-normal is parallel to the axis of Y, the reverse would be the case.

It is only in the very limited case described in the last paragraph that a medium which is identical in its internal structure with its image by reflection can have the property of circular or elliptic polarization.

To media which are unlike their images by reflection, and have the property of circular polarization, we may apply the following general principles. If the medium has any axis of symmetry, the ellipsoid or hyperboloids which represent the values of $\phi$ will have an axis in the same direction. If the medium after a revolution of less than 180° about any axis is equivalent to the medium in its first position, the ellipsoid or hyperboloids will have an axis of revolution in that direction.

23. The laws of the propagation of light in plane waves, which have thus been derived from the single hypothesis that the disturbance by which light is transmitted consists of solenoidal electrical fluxes, and which apply to light of different colors and to the most general case of perfectly transparent and sensibly homogeneous media not subject to magnetic action,* are essentially those which are generally received as embodying the results of experiment. In no particular, so far as the writer is aware, do they conflict with the results of experiment, or require the aid of auxiliary and forced hypotheses to bring them into harmony therewith.

In this respect, the electromagnetic theory of light stands in marked contrast with that theory in which the properties of an elastic solid are attributed to the ether, a contrast which was very distinct in Maxwell's derivation of Fresnel's laws from electrical principles, but becomes more striking as we follow the subject farther into its details, and take account of the want of absolute homogeneity in the medium, so as to embrace the phenomena of the dispersion of colors and circular and elliptical polarization.

* The rotation of the plane of polarization which is produced by magnetic action has been discussed by Maxwell (Treatise on Electricity and Magnetism, vol. ii, chap. xxi), and by Rowland (Amer. Journ. Math., vol. iii, p. 107).

XIII.
ON THE GENERAL EQUATIONS OF MONOCHROMATIC LIGHT IN MEDIA OF EVERY DEGREE OF TRANSPARENCY.

[American Journal of Science, ser. 3, vol. xxv, pp. 107-118. February, 1883.]

1. The last April and June numbers of this Journal* contain an investigation of the velocity of plane waves of light, in which they are regarded as consisting of solenoidal electrical fluxes in an indefinitely extended medium of uniform and very fine-grained structure. It was also supposed that the medium was perfectly transparent, although without discussion of the physical properties on which transparency depends, and that the electrical motions were not complicated by any distinctively magnetic phenomena. In the present paper† the subject will be treated with more generality, so as to obtain the general equations of monochromatic light for media of every degree of transparency, whether sensibly homogeneous or otherwise, which have a very fine-grained molecular structure as measured by a wave-length of light. There will be no restriction with respect to magnetic influence, except that an oscillating magnetization of the medium will be excluded.‡ In order to conform as much as possible to the ordinary view of

* See pages 182-194 and 195-210 of this volume.

† This paper contains, with some additional developments, the substance of a communication to the National Academy of Sciences in November, 1882.

‡ Where a body capable of magnetization is subjected to the influence of light (as when light is reflected from the surface of iron), there are two simple hypotheses which present themselves with respect to the magnetic state of the body. One is that the magnetic forces due to the light are not of sufficient duration to allow the molecular changes which constitute magnetization to take place to any sensible extent. The other is that the magnetization has a constant ratio to the magnetic force without regard to its duration.
We might easily make a more general hypothesis which would embrace both of those mentioned as extreme cases, and which would be irreproachable from a theoretical stand-point; but it would complicate our equations to a degree which would not be compensated by their greater generality, since no phenomena depending on such magnetization have been observed, so far as the writer is aware, or are likely to be, except in a very limited class of cases. For the purposes of this paper, therefore, it has seemed better to exclude media capable of magnetization, except so far as the first mentioned hypothesis may be applicable. But it does not appear that this requires us to exclude cases in which the medium is subject to the influence of a permanent magnetic force, such as produces the phenomenon of the magnetic rotation of the plane of polarization.

electrical phenomena,* we shall not introduce at first the hypothesis of Maxwell that electrical fluxes are solenoidal.† Our results, however, will be such as to require us to admit the substantial truth of this hypothesis, if we regard the processes involved in the transmission of light as electrical. With regard to the undetermined questions of electrodynamic induction, we shall adopt provisionally that hypothesis which appears the most simple, yet proceed in such a manner that it will be evident exactly how our results must be altered, if we prefer any other hypothesis. Electrical quantities will be treated as measured in electromagnetic units.

2.
We must distinguish, as before, between the actual electrical displacements, which are too complicated to follow in detail with analysis, and which in their minutiae elude experimental demonstration, and the displacements as averaged for spaces which are large enough to smooth out their minor irregularities, but not so large as to obliterate to any sensible extent those more regular features of the electrical motion, which form the subject of optical experiment. These spaces must therefore be large as measured by the least distances between molecules, but small as measured by a wave-length of light. We shall also have occasion to consider similar averages for other quantities, as electromotive force, the electrostatic potential, etc. It will be convenient to suppose that the space for which the average is taken is the same in all parts of the field,‡ say a sphere of uniform radius having its center at the point considered. Whatever may be the quantities considered, such averages will be represented by the notation $[\ \ ]_{\mathrm{Ave}}$.

* It has, perhaps, retarded the acceptance of the electromagnetic theory of light that it was presented in connection with a theory of electrical action, which is probably more difficult to prove or disprove, and certainly presents more difficulties of comprehension, than the connection of optical and electrical phenomena, and which, as resting largely on a priori considerations, must naturally appear very differently to different minds. Moreover, the mathematical method by which the subject was treated, while it will remain a striking monument of its author’s originality of thought, and profoundly modify the development of mathematical physics, must nevertheless, by its wide departure from ordinary methods, have tended to repel such as might not make it a matter of serious study.
† A flux is said to be solenoidal when it satisfies the conditions which characterize the motion of an incompressible fluid; in other words, if $u$, $v$, $w$ are the rectangular components of the flux, when $\frac{du}{dx}+\frac{dv}{dy}+\frac{dw}{dz}=0$, and the normal component of the flux is the same on both sides of any surfaces of discontinuity which may exist.

‡ This is rather to fix our ideas, than on account of any mathematical necessity. For the space for which the average is taken may in general be considerably varied without sensibly affecting the value of the average.

If, then, $\xi$, $\eta$, $\zeta$ denote the components of the actual displacement at the point considered, $[\xi]_{\mathrm{Ave}}$, $[\eta]_{\mathrm{Ave}}$, $[\zeta]_{\mathrm{Ave}}$ will represent the average values of these components in the small sphere about that point. These average values we shall treat as functions of the coordinates of the center of the sphere and of the time, and may call them, for brevity, the average values of $\xi$, $\eta$, $\zeta$. But however they may be designated, it is essential to remember that it is a space-average for a certain very small space, and never a time-average, that is intended. The object of this paper will be accomplished when we have expressed (explicitly or implicitly) the relations which subsist between the values of $[\xi]_{\mathrm{Ave}}$, $[\eta]_{\mathrm{Ave}}$, $[\zeta]_{\mathrm{Ave}}$, at different times and in different parts of the field; in other words, when we have found the conditions which these quantities must satisfy as functions of the time and the coordinates.

3. Let us suppose that luminous vibrations of any one period* are somewhere excited, and that the disturbance is propagated through the medium. The motions which are excited in any part of the medium, and the forces by which they are kept up, will be expressed by harmonic functions of the time, having the same period,†
as may be proved by the single principle of the superposition of motions quite independently of any theory of the constitution of the medium, or of the nature of the motions, as electrical or otherwise. This is equally true of the actual motions, and of the averages which we are to consider. We may therefore set

$$[\xi]_{\mathrm{Ave}} = a_1\cos\frac{2\pi}{p}t + a_2\sin\frac{2\pi}{p}t, \quad \text{etc.,} \qquad (1)$$

* There is no real loss of generality in making the light monochromatic, since in every case it may be divided into parts, which are separately propagated, and each of which is monochromatic to any required degree of approximation.

† It is of course possible that the expressions for the forces and displacements should have constant terms. But these will disappear, if the displacements are measured from the state of equilibrium about which the system vibrates, and we leave out of account in measuring the forces (and the electrostatic potential) that which would belong to the system in the state of equilibrium. To prevent misapprehension, it should be added that the term electrical displacement is not used in the restricted sense of dielectric displacement or polarization. The variation of the electrical displacement, as the term is used in this paper, constitutes what Maxwell calls the total motion of electricity or true current, and what he divides into two parts, which he distinguishes as the current of conduction and the variation of the electrical displacement. Such a division of the total motion of electricity is not necessary for the purposes of this paper, and the term displacement is used with reference to the total motion of electricity in a manner entirely analogous to that in which the term is ordinarily used in the theory of wave-motion.

where $t$ denotes the time, $p$ the period, and $a_1$, $a_2$ functions of the coordinates. It follows that

$$[\ddot{\xi}]_{\mathrm{Ave}} = -\frac{4\pi^2}{p^2}[\xi]_{\mathrm{Ave}}, \quad \text{etc.} \qquad (2)$$

4.
Now, on the electrical theory, these motions are excited by electrical forces, which are of two kinds, distinguished as electrostatic and electrodynamic. The electrostatic force is determined by the electrostatic potential. If we write $q$ for the actual value of the potential, and $[q]_{\mathrm{Ave}}$ for its value as averaged in the manner specified above, the components of the actual electrostatic force will be

$$-\frac{dq}{dx}, \quad -\frac{dq}{dy}, \quad -\frac{dq}{dz},$$

and for the average values of these components in the small spaces described above we may write

$$-\frac{d[q]_{\mathrm{Ave}}}{dx}, \quad -\frac{d[q]_{\mathrm{Ave}}}{dy}, \quad -\frac{d[q]_{\mathrm{Ave}}}{dz},$$

for it will make no difference whether we take the average before or after differentiation.

5. The electrodynamic force is determined by the acceleration of electrical flux in all parts of the field, but physicists are not entirely agreed in regard to the laws by which it is determined. This difference of opinion is however of less importance, since it will not affect the result if electrical fluxes are always solenoidal. According to the most simple law, the components of the force are given by the volume-integrals

$$-\int\frac{\ddot{\xi}}{r}\,dv, \quad -\int\frac{\ddot{\eta}}{r}\,dv, \quad -\int\frac{\ddot{\zeta}}{r}\,dv,$$

where $dv$ represents an element of volume, and $r$ the distance of this element from the point for which the value of the electromotive force is to be determined. In other words, the components of the force at any point are determined from the components of acceleration in all parts of the field by the same process by which (in the theories of gravitation, etc.) the value of the potential at any point is determined from the density of matter in all parts of space, except that the sign is to be reversed. Adopting this law, provisionally at least, we may express it by saying that the components of electrodynamic force are equal to the potentials taken negatively of the components of acceleration of electrical flux.
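The provisional law just stated can be written out for one component, as a sketch. The symbol $X_{\mathrm{dyn}}$ for the x-component of the electrodynamic force and the primed integration variables are notational assumptions introduced here, not the author's. The second relation is the standard Poisson property of the Newtonian potential, and the third uses the harmonic time-dependence of equation (2).

```latex
% X_dyn : x-component of electrodynamic force (name assumed for illustration)
% r     : distance from the element dv' to the point (x, y, z)
\begin{align*}
  X_{\mathrm{dyn}}(x,y,z)
    &= -\int \frac{\ddot{\xi}(x',y',z')}{r}\,dv' , \\
  \nabla^{2} X_{\mathrm{dyn}}
    &= 4\pi\,\ddot{\xi}
    \qquad\text{since } \nabla^{2}\!\int \frac{f}{r}\,dv' = -4\pi f , \\
  X_{\mathrm{dyn}}
    &= \frac{4\pi^{2}}{p^{2}} \int \frac{\xi}{r}\,dv'
    \qquad\text{for motions of period } p .
\end{align*}
```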
And we may write, for brevity, $-\mathrm{Pot}\,\ddot{\xi}$, $-\mathrm{Pot}\,\ddot{\eta}$, $-\mathrm{Pot}\,\ddot{\zeta}$, for the components of force, using the symbol Pot to denote the operation by which the potential of a mass is derived from its density. For the average values of these components in the small spaces defined above, we may write $-\mathrm{Pot}\,[\ddot{\xi}]_{\mathrm{Ave}}$, $-\mathrm{Pot}\,[\ddot{\eta}]_{\mathrm{Ave}}$, $-\mathrm{Pot}\,[\ddot{\zeta}]_{\mathrm{Ave}}$, since it will make no difference whether we take the average before or after the operation of taking the potential.

6. If we write X, Y, Z for the components of the total electromotive force (electrostatic and electrodynamic), we have

$$[X]_{\mathrm{Ave}} = -\mathrm{Pot}\,[\ddot{\xi}]_{\mathrm{Ave}} - \frac{d[q]_{\mathrm{Ave}}}{dx}, \quad \text{etc.,} \qquad (3)$$

or by (2)

$$[X]_{\mathrm{Ave}} = \frac{4\pi^2}{p^2}\,\mathrm{Pot}\,[\xi]_{\mathrm{Ave}} - \frac{d[q]_{\mathrm{Ave}}}{dx}, \quad \text{etc.} \qquad (4)$$

It will be convenient to represent these relations by a vector notation. If we represent the displacement by U, and the electromotive force by E, the three equations of (3) will be represented by the single vector equation

$$[E]_{\mathrm{Ave}} = -\mathrm{Pot}\,[\ddot{U}]_{\mathrm{Ave}} - \nabla[q]_{\mathrm{Ave}}, \qquad (5)$$

and the three equations of (4) by the single vector equation

$$[E]_{\mathrm{Ave}} = \frac{4\pi^2}{p^2}\,\mathrm{Pot}\,[U]_{\mathrm{Ave}} - \nabla[q]_{\mathrm{Ave}}, \qquad (6)$$

where, in accordance with quaternionic usage, $\nabla[q]_{\mathrm{Ave}}$ represents the vector which has for components the derivatives of $[q]_{\mathrm{Ave}}$ with respect to rectangular coordinates. The symbol Pot in such a vector equation signifies that the operation which is denoted by this symbol in a scalar equation is to be performed upon each of the components of the vector.

7. We may here observe that if we are not satisfied with the law adopted for the determination of electrodynamic force we have only to substitute for $-\mathrm{Pot}$ in these vector equations, and in those which follow, the symbol for the operation, whatever it may be, by which we calculate the electrodynamic force from the acceleration.* For the operation must be of such a character that if the acceleration consist of any number of parts, the force due to the whole acceleration will be the resultant of the forces due to the separate parts. It will evidently make no difference whether we take an average before or after such an operation.
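The remark that averaging may be taken before or after a linear operation can be checked numerically on a toy model. The sketch below assumes a one-dimensional periodic grid, a moving box average in place of the small sphere, and an inverse-distance weighting as a stand-in for Pot; all function names are hypothetical, and the model illustrates only the commutation property, not the physical law.

```python
import math

def circular_convolve(signal, kernel):
    """Circular convolution: out[i] = sum_j kernel[j] * signal[(i - j) mod n]."""
    n = len(signal)
    return [sum(kernel[j] * signal[(i - j) % n] for j in range(n))
            for i in range(n)]

def box_average_kernel(n, half_width):
    """Equal weights over 2*half_width + 1 neighbouring grid points."""
    width = 2 * half_width + 1
    kernel = [0.0] * n
    for offset in range(-half_width, half_width + 1):
        kernel[offset % n] = 1.0 / width
    return kernel

def inverse_distance_kernel(n):
    """Toy 1/r weighting on the periodic grid (self term set to zero)."""
    kernel = [0.0] * n
    for j in range(1, n):
        kernel[j] = 1.0 / min(j, n - j)
    return kernel

n = 64
# A smooth wave plus an irregular small-scale part, mimicking the actual field.
field = [math.sin(2 * math.pi * 3 * i / n) + 0.3 * ((i * 7919) % 13 - 6)
         for i in range(n)]

average = box_average_kernel(n, 2)
potential = inverse_distance_kernel(n)

operate_then_average = circular_convolve(circular_convolve(field, potential), average)
average_then_operate = circular_convolve(circular_convolve(field, average), potential)

max_diff = max(abs(a - b)
               for a, b in zip(operate_then_average, average_then_operate))
print("largest discrepancy:", max_diff)
```

Both operations here are linear and act in the same way at every point of the grid, which is why the order is immaterial up to rounding error.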
* The same would not be true of the corresponding scalar equations, (3) and (4). For one component of the force might depend upon all the components of acceleration. Such is in fact the case with the law of electromotive force proposed by Weber.

8. Let us now examine the relation which subsists between the values of $[E]_{\mathrm{Ave}}$ and $[U]_{\mathrm{Ave}}$ for the same point, that is, between the average electromotive force and the average displacement in a small sphere with its center at the point considered. We have already seen that the forces and the displacements are harmonic functions of the time having a common period. A little consideration will show that if the average electromotive force in the sphere is given as a function of the time, the displacements in the sphere, both average and actual, must be entirely determined. Especially will this be evident, if we consider that since we have made the radius of the sphere very small in comparison with a wave-length, the average force must have sensibly the same value throughout the sphere (that is, if we vary the position of the center of the sphere for which the average is taken by a distance not greater than the radius, the value of the average will not be sensibly affected), and that the difference of the actual and average force at any point is entirely determined by the motions in the immediate vicinity of that point.
If, then, certain oscillatory motions may be kept up in the sphere under the influence of electrostatic and electrodynamic forces due to the motion in the whole field, and if we suppose the motions in and very near that sphere to be unchanged, but the motions in the remoter parts of the field to be altered, only not so as to affect the average resultant of electromotive force in the sphere, the actual resultant of electromotive force will also be unchanged throughout the sphere, and therefore the motions in the sphere will still be such as correspond to the forces. Now the average displacement is a harmonic function of the time having a period which we suppose given. It is therefore entirely determined for the whole time the vibrations continue by the values of the six quantities

$$[\xi]_{\mathrm{Ave}},\ [\eta]_{\mathrm{Ave}},\ [\zeta]_{\mathrm{Ave}},\ [\dot{\xi}]_{\mathrm{Ave}},\ [\dot{\eta}]_{\mathrm{Ave}},\ [\dot{\zeta}]_{\mathrm{Ave}}$$

at any one instant. For the same reason the average electromotive force is entirely determined for the whole time by the values of the six quantities

$$[X]_{\mathrm{Ave}},\ [Y]_{\mathrm{Ave}},\ [Z]_{\mathrm{Ave}},\ [\dot{X}]_{\mathrm{Ave}},\ [\dot{Y}]_{\mathrm{Ave}},\ [\dot{Z}]_{\mathrm{Ave}}$$

for the same instant. The first six quantities will therefore be functions of the second, and the principle of the superposition of motions requires that they shall be homogeneous functions of the first degree. And the second six quantities will be homogeneous functions of the first degree of the first six. The coefficients by which these functions are expressed will depend upon the nature of the medium in the vicinity of the point considered. They will also depend upon the period of vibration, that is, upon the color of the light.* We may therefore write in vector notation

$$[E]_{\mathrm{Ave}} = \Phi[U]_{\mathrm{Ave}} + \Psi[\dot{U}]_{\mathrm{Ave}}, \qquad (7)$$

where $\Phi$ and $\Psi$ denote linear functions.† The optical properties of media are determined by the form of these functions. But all forms of linear functions would not be consistent with the principle of the conservation of energy.
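The energy restriction can be made concrete by computing the mean rate at which a force of the form (7) does work on a harmonic flux. The sketch below is an illustration under stated assumptions: the two linear functions are represented by 3×3 matrices with invented entries, the displacement is taken as A1 cos(2πt/p) + A2 sin(2πt/p), and the names `mean_power`, `phi_sym`, `psi_skew`, `psi_damp` and `phi_asym` are hypothetical.

```python
import math

def mat_vec(m, v):
    return [sum(m[i][j] * v[j] for j in range(3)) for i in range(3)]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def mean_power(phi, psi, a1, a2, period, steps=2000):
    """Time-average over one period of E . dU/dt, with E = phi U + psi dU/dt
    and U = a1 cos(wt) + a2 sin(wt)."""
    omega = 2 * math.pi / period
    total = 0.0
    for k in range(steps):
        t = period * k / steps
        c, s = math.cos(omega * t), math.sin(omega * t)
        u = [a1[i] * c + a2[i] * s for i in range(3)]
        du = [omega * (-a1[i] * s + a2[i] * c) for i in range(3)]
        e = [x + y for x, y in zip(mat_vec(phi, u), mat_vec(psi, du))]
        total += dot(e, du)
    return total / steps

a1, a2, period = [1.0, 0.0, 0.5], [0.0, 2.0, -1.0], 1.0
omega = 2 * math.pi / period
zero = [[0.0] * 3 for _ in range(3)]

phi_sym = [[2.0, 0.5, 0.0], [0.5, 3.0, 0.1], [0.0, 0.1, 4.0]]    # self-conjugate
psi_skew = [[0.0, 1.0, 0.0], [-1.0, 0.0, 0.0], [0.0, 0.0, 0.0]]  # at right angles to its argument
psi_damp = [[0.2, 0.0, 0.0], [0.0, 0.3, 0.0], [0.0, 0.0, 0.1]]   # acute angle with its argument
phi_asym = [[2.0, 1.0, 0.0], [0.0, 3.0, 0.0], [0.0, 0.0, 4.0]]   # not self-conjugate

power_transparent = mean_power(phi_sym, psi_skew, a1, a2, period)
power_absorbing = mean_power(phi_sym, psi_damp, a1, a2, period)
power_nonconjugate = mean_power(phi_asym, zero, a1, a2, period)

# Closed form for the absorbing case: (omega^2 / 2) (a1 . psi a1 + a2 . psi a2)
expected_absorbing = 0.5 * omega ** 2 * (dot(a1, mat_vec(psi_damp, a1))
                                         + dot(a2, mat_vec(psi_damp, a2)))
print(power_transparent, power_absorbing, power_nonconjugate)
```

In this toy calculation no energy is absorbed on the average when the first matrix is symmetric and the second is skew, a positive amount is absorbed when the second matrix is positive, and a matrix that is not self-conjugate yields a non-zero mean power even with the second function absent.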
In media which are more or less opaque, and which therefore absorb energy, $\Psi$ must be of such a form that the function always makes an acute angle (or none) with the independent variable. In perfectly transparent media, $\Psi$ must vanish, unless the function is at right angles to the independent variable. So far as is known, the last occurs only when the medium is subject to magnetic influence. In perfectly transparent media, the principle of the conservation of energy requires that $\Phi$ should be self-conjugate, i.e., that for three directions at right angles to one another, the function and independent variable should coincide in direction. In all isotropic media not subject to magnetic influence, it is probable that $\Phi$ and $\Psi$ reduce to numerical coefficients, as is certainly the case with $\Phi$ for transparent isotropic media.

9. Comparing the two values of $[E]_{\mathrm{Ave}}$, we have

$$\frac{4\pi^2}{p^2}\,\mathrm{Pot}\,[U]_{\mathrm{Ave}} - \nabla[q]_{\mathrm{Ave}} = \Phi[U]_{\mathrm{Ave}} + \Psi[\dot{U}]_{\mathrm{Ave}}. \qquad (8)$$

This equation, in connection with that by which we express the solenoidal character of the displacements, if we regard them as necessarily solenoidal, or in connection with that which expresses the relation between the electrostatic potential and the displacements, if we reject the solenoidal hypothesis, may be regarded as the general equation of the vibrations of monochromatic light, considered as oscillating electrical fluxes. For the symbol Pot, however, we must substitute the symbol representing the operation by which electromotive force is calculated from acceleration of flux, with the negative sign, if we are not satisfied with the law provisionally adopted.

* The relation between the displacements in one of the small spaces considered and the average electromotive force is mathematically analogous to the relation between the displacements in a system of a high degree of complexity and certain forces exerted from without, which are harmonic functions of the time and under the influence of which the system vibrates.
The ratio of the displacements to the forces will in general vary with the period, and may vary very rapidly. An example in which these functions vary very rapidly with the period is afforded by the phenomena of selective absorption and abnormal dispersion.

† A vector is said to be a linear function of another, when the three components of the first are homogeneous functions of the first degree of the three components of the second.

It is important to observe that the existence of molecular vibrations of ponderable matter, due to the passage of light through the medium, will not affect the reasoning by which this equation has been established, provided that the nature and intensity of these vibrations in any small part of the medium (as measured by a wave-length) are entirely determined by the electrical forces and motions in that part of the medium. But the equation would not hold in case of molecular vibrations due to magnetic force. Such vibrations would constitute an oscillating magnetization of the medium, which has already been excluded from the discussion. The supposition which has sometimes been made,* that electricity possesses a certain mass or inertia, would not at all affect the validity of the equation.

10. The equation may be reduced to a form in some respects more simple by the use of the so-called imaginary quantities. We shall write $\iota$ for $\sqrt{-1}$. If we differentiate with respect to the time, and substitute $-\frac{4\pi^2}{p^2}[U]_{\mathrm{Ave}}$ for $[\ddot{U}]_{\mathrm{Ave}}$, we obtain

$$\frac{4\pi^2}{p^2}\,\mathrm{Pot}\,[\dot{U}]_{\mathrm{Ave}} - \nabla[\dot{q}]_{\mathrm{Ave}} = \Phi[\dot{U}]_{\mathrm{Ave}} - \frac{4\pi^2}{p^2}\Psi[U]_{\mathrm{Ave}}.$$

If we multiply this equation by $\iota$, either alone or in connection with any real factor, and add it to the preceding, we shall obtain an equation which will be equivalent to the two of which it is formed. Multiplying by $-\iota\frac{p}{2\pi}$ and adding, we have

$$\frac{4\pi^2}{p^2}\,\mathrm{Pot}\left([U]_{\mathrm{Ave}} - \iota\frac{p}{2\pi}[\dot{U}]_{\mathrm{Ave}}\right) - \nabla\left([q]_{\mathrm{Ave}} - \iota\frac{p}{2\pi}[\dot{q}]_{\mathrm{Ave}}\right) = \left(\Phi + \iota\frac{2\pi}{p}\Psi\right)\left([U]_{\mathrm{Ave}} - \iota\frac{p}{2\pi}[\dot{U}]_{\mathrm{Ave}}\right).$$

If we set

$$W = [U]_{\mathrm{Ave}} - \iota\frac{p}{2\pi}[\dot{U}]_{\mathrm{Ave}}, \qquad (9)$$

$$Q = [q]_{\mathrm{Ave}} - \iota\frac{p}{2\pi}[\dot{q}]_{\mathrm{Ave}}, \qquad (10)$$

$$\Theta = \Phi + \iota\frac{2\pi}{p}\Psi, \qquad (11)$$

our equation reduces to

$$\frac{4\pi^2}{p^2}\,\mathrm{Pot}\,W - \nabla Q = \Theta W. \qquad (12)$$

In this equation $\Theta$ denotes a complex linear vector function, i.e., a vector function of which the X-, Y-, and Z-components are expressed in terms of the X-, Y-, and Z-components of the independent variable by means of coefficients of the form $a + \iota b$. W is a bivector of which

* See Weber, Abhandl. d. K. Sächs. Gesellsch. d. Wiss., vol. vi, pp. 593-597; Lorberg, Crelle’s Journal, vol. lxi, p. 55.

the real part represents the averaged displacement $[U]_{\mathrm{Ave}}$, and the coefficient of $\iota$ the rate of increase of the same multiplied by a constant factor. This bivector therefore represents the average state of a small part of the field both with respect to position and velocity. We may also say that the coefficient of $\iota$ in W represents the value of the averaged displacement $[U]_{\mathrm{Ave}}$ at a time one-quarter of a vibration earlier than the time principally considered.

11. It may serve to fix our ideas to see how W is expressed as a function of the time. We may evidently set

$$[U]_{\mathrm{Ave}} = A_1\cos\frac{2\pi}{p}t + A_2\sin\frac{2\pi}{p}t,$$

where $A_1$ and $A_2$ are vectors representing the amplitudes of the two parts into which the vibration is resolved. Then

$$\frac{p}{2\pi}[\dot{U}]_{\mathrm{Ave}} = -A_1\sin\frac{2\pi}{p}t + A_2\cos\frac{2\pi}{p}t,$$

and

$$[U]_{\mathrm{Ave}} - \iota\frac{p}{2\pi}[\dot{U}]_{\mathrm{Ave}} = (A_1 - \iota A_2)\left(\cos\frac{2\pi}{p}t + \iota\sin\frac{2\pi}{p}t\right)$$

[...]

which will be equivalent to (12), if

$$W = \left(\frac{K}{4\pi} - \iota\frac{p}{2\pi}C\right)\left(\frac{4\pi^2}{p^2}\,\mathrm{Pot}\,W - \nabla Q\right), \qquad (17)$$

where K and C denote in the most general case the linear vector functions, but in isotropic bodies the numerical coefficients, which represent inductive capacity and conductivity. By a simple transformation [see (9) and (10)], this equation becomes

$$\Theta^{-1} = \frac{K}{4\pi} - \iota\frac{p}{2\pi}C, \qquad (18)$$

where $\Theta^{-1}$ represents the function inverse to $\Theta$. Now, while experiment appears to verify the existence of such a law as is expressed by equation (12), it does not show that $\Theta$ has the precise form indicated by equation (16).
In other words, experiment does not satisfactorily verify the relations expressed by (16) and (17), if K and C are understood to be the operators (or, in isotropic bodies, the numbers) which represent inductive capacity and conductivity in the ordinary sense of the terms. The discrepancy is most easily shown in the most simple case, when the medium is isotropic and perfectly transparent, and $\Theta$ reduces to a numerical quantity. The square of the velocity of plane waves is then equal to $\frac{\Theta}{4\pi}$, and equation (18) would make it independent of the

* Phil. Trans., vol. clv (1865), p. 459, or Treatise on Electricity and Magnetism, chap. xx.

† Schlömilch’s Zeitschrift, vol. xxii, pp. 1-30 and 205-219; xxiii, pp. 197-210.

‡ See Fitzgerald, Phil. Trans., vol. clxxi, p. 691; J. J. Thomson, Phil. Mag. (5), vol. ix, p. 284; Rayleigh, Phil. Mag. (5), vol. xii, p. 81. That the electromagnetic theory of light gives the conditions relative to the boundary of different media, which are required by the phenomena of reflection and refraction, was first shown by Helmholtz. See Crelle’s Journal, vol. lxxii (1870), p. 57.

period; that is, would give no dispersion of colors. The case is essentially the same in transparent bodies which are not isotropic.* The case is worse with metals, which are characterized electrically by great conductivity, and optically by great opacity. In their papers cited above, Lorentz and Rayleigh have observed that the experiments of Jamin on the reflection of light from metallic surfaces would often require, as ordinarily interpreted on the electromagnetic theory, a negative value for the inductive capacity of the metal. This would imply that the electrical equilibrium in the metal is unstable.
The objection, therefore, is essentially the same as that which Lord Rayleigh had previously made to Cauchy’s theory of metallic reflection, viz., that the apparent mechanical explanation of the phenomena is illusory, since the numerical values given by experiment as interpreted on Cauchy’s theory would involve an unstable equilibrium of the ether in the metal.†

13. All this points to the same conclusion, that the ordinary view of the phenomena is inadequate. The object of this paper will be accomplished, if it has been made clear how a point of view more in accordance with what we know of the molecular constitution of bodies will give that part of the ordinary theory which is verified by experiment, without including that part which is in opposition to observed facts.‡

* See note to the first paper of Lorentz, cited above, Schlömilch, vol. xxii, p. 23.

† See Phil. Mag. (4), vol. xliii (1872), p. 321.

‡ The consideration of the processes which we may suppose to take place in the smallest parts of a body through which light is transmitted, farther than is necessary to establish the general equation given above, is foreign to the design of this paper. Yet a word may be added with respect to the difficulties signalized in the ordinary form of the theory. The comparatively simple case of a perfectly transparent body has been examined more in detail in one of the papers already cited, where there is given an explanation of the dispersion of colors from the point of view of this paper. It is there shown that the effect of the non-homogeneity of the body in its smallest parts is to add a term to the expression for the kinetic energy of electrical waves, which for an isotropic body may be roughly described as similar to that which would be required if the electricity had a certain mass or inertia. (See especially §§ 7, 9 and 12, [this volume pages 185 ff.]) The same must be true of media of any degree of opacity.
Now the difficulty with the optical properties of the metals is that the real part of $\Theta$ (or $\Theta^{-1}$) is in some cases negative. This implies that at a moment of greatest displacement the electromotive force is in the direction opposite to the displacement, instead of having the same direction, as in transparent isotropic bodies. Now a certain part of the electromotive force must be required to oppose the apparent inertia, and another part to oppose the electrical elasticity of the medium. These parts of the force must have opposite directions. In transparent bodies the latter part is by far the greater. But it need not surprise us that the former should be the greater in some metals. It has been remarked by Lorentz that the difficulty with respect to metals would be in a measure relieved if we should suppose electricity to have the property of inertia. (See § 11 of his third paper, Schlömilch’s Zeitschrift, vol. xxiii, p. 208.) But a supposition of this kind, taken literally, would involve a dispersion of colors in vacuo, and still be inadequate, as Lorentz remarks, to explain the phenomena observed in metals.

While the writer has aimed at a greater degree of rigor than is usual in the establishment of the fundamental equation of monochromatic light, it is not claimed that this equation is absolutely exact. The contrary is evident from the fact that the equation does not embrace the phenomena which characterize such circularly polarizing bodies as quartz. This, however, only implies the neglect of extremely small quantities, very small, for example, as compared with those which determine the dispersion of colors. In one of the papers already cited,* the case of a perfectly transparent body is treated with a higher degree of approximation, so as to embrace the phenomena in question.

* See page 195 of this volume.

XIV.
A COMPARISON OF THE ELASTIC AND THE ELECTRICAL THEORIES OF LIGHT WITH RESPECT TO THE LAW OF DOUBLE REFRACTION AND THE DISPERSION OF COLORS.

[American Journal of Science, ser. 3, vol. xxxv, pp. 467-475, June, 1888.]

It is claimed for the electrical* theory of light that it is free from serious difficulties, which beset the explanation of the phenomena of light by the dynamics of elastic solids. Just what these difficulties are, and why they do not occur in the explanation of the same phenomena by the dynamics of electricity, has not perhaps been shown with all the simplicity and generality which might be desired. Such a treatment of the subject is however the more necessary on account of the ever-increasing bulk of the literature on either side, and the confusing multiplicity of the elastic theories. It is the object of this paper to supply this want, so far as respects the propagation of plane waves in transparent and sensibly homogeneous media. The simplicity of this part of the subject renders it appropriate for the first test of any optical theory, while the precision of which the experimental determinations are capable, renders the test extremely rigorous. It is moreover, as the writer believes, an appropriate time for the discussion proposed, since on one hand the experimental verification of Fresnel’s Law has recently been carried to a degree of precision far exceeding anything which we have had before,† and on the other, the

* The term electrical seems the most simple and appropriate to describe that theory of light which makes it consist in electrical motions. The cases in which any distinctively magnetic action is involved in the phenomena of light are so exceptional, that it is difficult to see any sufficient reason why the general theory should be called electromagnetic, unless we are to call all phenomena electromagnetic which depend on the motions of electricity.
† In the recent experiments of Professor Hastings relating to the index of refraction of the extraordinary ray in Iceland spar for the spectral line D2 and a wave-normal inclined at about 31° to the optic axis, the difference between the observed and the calculated values was only two or three units in the sixth decimal place (in the seventh significant figure), which was about the probable error of the determinations. See Am. Jour. Sci. ser. 3, vol. xxxv, p. 60.

discovery of a remarkable theorem relating to the vibrations of a strained solid* has given a new impulse to the study of the elastic theory of light. Let us first consider the facts to which a correct theory must conform. It is generally admitted that the phenomena of light consist in motions (of the type which we call wave-motions) of something which exists both in space void of ponderable matter, and in the spaces between the molecules of bodies, perhaps also in the molecules themselves. The kinematics of these motions is pretty well understood; the question at issue is whether it agrees with the dynamics of elastic solids or with the dynamics of electricity. In the case of a simple harmonic wave-motion, which alone we need consider, the wave-velocity ($V$) is the quotient of the wave-length ($l$) by the period of vibration ($p$). These quantities can be determined with extreme accuracy. In media which are sensibly homogeneous but not isotropic the wave-velocity $V$, for any constant value of the period, is a quadratic function of the direction cosines of a certain line, viz., the normal to the so-called “plane of polarization.” The physical characteristics of this line have been a matter of dispute. Fresnel considered it to be the direction of displacement. Others have maintained that it is the common perpendicular to the wave-normal and the displacement. Others again would define it as that component of the displacement which is perpendicular to the wave-normal.
This of course would differ from Fresnel’s view only in case the displacements are not perpendicular to the wave-normal, and would in that case be a necessary modification of his view. Although this dispute has been one of the most celebrated in physics, it seems to be at length substantially settled, most directly by experiments upon the scattering of light by small particles, which seem to show decisively that in isotropic media at least the displacements are normal to the “plane of polarization,” and also, with hardly less cogency, by the difficulty of accounting for the intensities of reflected and refracted light on any other

* Sir Wm. Thomson has shown that if an elastic incompressible solid in which the potential energy of any homogeneous strain is proportional to the sum of the squares of the reciprocals of the principal elongations minus three is subjected to any homogeneous strain by forces applied to its surface, the transmission of plane waves of distortion, superposed on this homogeneous strain, will follow exactly Fresnel’s law (including the direction of displacement), the three principal velocities being proportional to the reciprocals of the principal elongations. It must be a surprise to mathematicians and physicists to learn that a theorem of such simplicity and beauty has been waiting to be discovered in a field which has been so carefully gleaned. See page 116 of the current volume (xxv) of the Philosophical Magazine.

supposition.* It should be added that all diversity of opinion on this subject has been confined to those whose theories are based on the dynamics of elastic bodies.
Defenders of the electrical theory have always placed the electrical displacement at right angles to the “plane of polarization.” It will, however, be better to assume this direction of the displacement as probable rather than as absolutely certain, not so much because many are likely to entertain serious doubts on the subject, as in order not to exclude views which have at least a historical interest. The wave-velocity, then, for any constant period, is a quadratic function of the cosines of a certain direction, which is probably that of the displacement, but in any case determined by the displacement and the wave-normal. The coefficients of this quadratic function are functions of the period of vibration. It is important to notice that these coefficients vary separately, and often quite differently, with the period, and that the case does not at all resemble that of a quadratic function of the direction-cosines multiplied by a quantity depending on the period. In discussing the dynamics of the subject we may gain something in simplicity by considering a system of stationary waves, such as results from two similar systems of progressive waves moving in opposite directions. In such a system the energy is alternately entirely kinetic and entirely potential. Since the total energy is constant, we may set the average kinetic energy per unit of volume at the moment when there is no potential energy, equal to the average potential energy per unit of volume when there is no kinetic energy.† We may call this the equation of energies. It will contain the quantities $l$ and $p$, and thus furnish an expression for the velocity of either system of progressive waves. We have to see whether the elastic or the electric theory gives the expression most conformed to the facts.
* “At the same time, if the above reasoning be valid, the question as to the direction of the vibrations in polarized light is decided in accordance with the view of Fresnel. ... I confess I cannot see any room for doubt as to the result it leads to. ... I only mean that if light, as is generally supposed, consists of transversal vibrations similar to those which take place in an elastic solid, the vibration must be normal to the plane of polarization.” Lord Rayleigh, “On the Light from the Sky, its Polarization and Color,” Phil. Mag. (4), xli (1871), p. 109. “Green’s dynamics of polarization by reflexion, and Stokes’ dynamics of the diffraction of polarized light, and Stokes’ and Rayleigh’s dynamics of the blue sky, all agree in, as it seems to me, irrefragably, demonstrating Fresnel’s original conclusion, that in plane polarized light the line of vibration is perpendicular to the plane of polarization.” Sir Wm. Thomson, loc. citat.

† The terms kinetic energy and potential energy will be used in this paper to denote these average values.

Let us first apply the elastic theory to the case of the so-called vacuum. If we write $h$ for the amplitude measured in the middle between two nodal planes, the velocities of displacement will be as $h/p$, and the kinetic energy will be represented by $A\,h^2/p^2$, where $A$ is a constant depending on the density of the medium. The potential energy, which consists in distortion of the medium, may be represented by $B\,h^2/l^2$, where $B$ is a constant depending on the rigidity of the medium. The equation of energies, on the elastic theory, is therefore

$A\,\frac{h^2}{p^2} = B\,\frac{h^2}{l^2}$, (1)

which gives

$V^2 = \frac{l^2}{p^2} = \frac{B}{A}$. (2)

In the electrical theory, the kinetic energy is not determined by the simple formula of ordinary dynamics from the square of the velocity of each element, but is found by integrating the product of the velocities of each pair of elements divided by the distance between them.
Very elementary considerations suffice to show that a quantity thus determined when estimated per unit of volume will vary as the square of the wave-length. We may therefore set $F\,l^2\,h^2/p^2$ for the kinetic energy, $F$ being a constant. The potential energy does not consist in distortion of the medium, but depends upon an elastic resistance to the separation of the electricities, which constitutes the electrical displacement, and is proportioned to the square of this displacement. The average value of the potential energy per unit of volume will therefore be represented in the electrical theory by $G\,h^2$, where $G$ is a constant, and the equation of energies will be

$F\,l^2\,\frac{h^2}{p^2} = G\,h^2$, (3)

which gives

$V^2 = \frac{l^2}{p^2} = \frac{G}{F}$. (4)

Both theories give a constant velocity, as is required. But it is instructive to notice the profound difference in the equations of energy from which this result is derived. In the elastic theory the square of the wave-length appears in the potential energy as a divisor; in the electrical theory it appears in the kinetic energy as a factor. Let us now consider how these equations will be modified by the presence of ponderable matter, in the most general case of transparent and sensibly homogeneous bodies. This subject is rendered much more simple by the fact that the distances between the ponderable molecules are very small compared with a wave-length. Or, what amounts to the same thing, but may present a more distinct picture to the imagination, the wave-length may be regarded as enormously great in comparison with the distances between neighboring molecules. Whatever view we take of the motions which constitute light, we can hardly suppose them (disturbed as they are by the presence of the ponderable molecules) to be in strictness represented by the equations of wave-motion. Yet in a certain sense a wave-motion may and does exist.
If, namely, instead of the actual displacement at any point, we consider the average displacement in a space large enough to contain an immense number of molecules, and yet small as measured by a wave-length, such average displacements may be represented by the equations of wave-motion; and it is only in this sense that any theory of wave-motion can apply to the phenomena of light in transparent bodies. When we speak of displacements, amplitudes, velocities (of displacement), etc., it must therefore be understood in this way. The actual kinetic energy, on either theory, will evidently be greater than that due to the motion thus averaged or smoothed, and to a degree presumably depending on the direction of the displacement. But since displacement in any direction may be regarded as compounded of displacements in three fixed directions, the additional energy will be a quadratic function of the components of velocity of displacement, or, in other words, a quadratic function of the direction-cosines of the displacement multiplied by the square of the amplitude and divided by the square of the period.* This additional energy may be understood as including any part of the kinetic energy of the wave-motion which may belong to the ponderable particles. The term to be added to the kinetic energy on the electric theory may therefore be written $f_D\,h^2/p^2$, where $f_D$ is a quadratic function of the direction-cosines of the displacement. The elastic theory requires a term of precisely the same character, but since the term to which it is to be added is of the same general form, the two may be incorporated in a single term of the form $A_D\,h^2/p^2$, where $A_D$ is a quadratic function of the direction-cosines of the displacement. We must, however, notice that both $A_D$ and $f_D$ are not entirely independent of the period. For the manner in which the flux of the luminiferous medium is distributed among the ponderable molecules will naturally depend somewhat upon the period.
* For proof in extenso of this proposition, when the motions are supposed electrical, the reader is referred to page 187 of this volume.

The same is true of the degree to which the molecules may be thrown into vibration. But $A_D$ and $f_D$ will be independent of the wave-length (except so far as this is connected with the period), because the wave-length is enormously great compared with the size of the molecules and the distances between them. The potential energy on the elastic theory must be increased by a term of the form $b_D\,h^2$, where $b_D$ is a quadratic function of the direction-cosines of the displacement. For the ponderable particles must oppose a certain elastic resistance to the displacement of the ether, which in aeolotropic bodies will presumably be different in different directions. The potential energy on the electric theory will be represented by a single term of the same form, say $G_D\,h^2$, where a quadratic function of the direction-cosines of the displacement, $G_D$, takes the place of the constant $G$, which was sufficient when the ponderable particles were absent. Both $G_D$ and $b_D$ will vary to some extent with the period, like $A_D$ and $f_D$, and for the same reason. In regard to that potential energy, which on the elastic theory is independent of the direct action of the ponderable molecules, it has been supposed that in aeolotropic bodies the effect of the molecules is such as to produce an aeolotropic state in the ether, so that the energy of a distortion varies with its orientation. This part of the potential energy will then be represented by $B_{ND}\,h^2/l^2$, where $B_{ND}$ is a function of the directions of the wave-normal and the displacement. It may easily be shown that it is a quadratic function both of the direction-cosines of the wave-normal and of those of the displacement.
Also, that if the ether in the body when undisturbed is not in a state of stress due to forces at the surface of the body, or if its stress is uniform in all directions, like a hydrostatic pressure, the function $B_{ND}$ must be symmetrical with respect to the two sets of direction-cosines. The equation of energies for the elastic theory is therefore

$A_D\,\frac{h^2}{p^2} = B_{ND}\,\frac{h^2}{l^2} + b_D\,h^2$, (5)

which gives

$V^2 = \frac{l^2}{p^2} = \frac{B_{ND}}{A_D - b_D\,p^2}$. (6)

The equation of energies for the electrical theory is

$F\,l^2\,\frac{h^2}{p^2} + f_D\,\frac{h^2}{p^2} = G_D\,h^2$, (7)

which gives

$V^2 = \frac{l^2}{p^2} = \frac{G_D}{F} - \frac{f_D}{F\,p^2}$. (8)

It is evident at once that the electrical theory gives exactly the form that we want. For any constant period the square of the wave-velocity is a quadratic function of the direction-cosines of the displacement. When the period varies, this function varies, the different coefficients in the function varying separately, because $G_D$ and $f_D$ will not in general be similar functions.* If we consider a constant direction of displacement while the period varies, $G_D$ and $f_D$ will only vary so far as the type of the motion varies, i.e., so far as the manner in which the flux distributes itself among the ponderable molecules and intermolecular spaces, and the extent to which the molecules take part in the motion are changed. There are cases in which these vary rapidly with the period, viz., cases of selective absorption and abnormal dispersion. But we may fairly expect that there will be many cases in which the character of the motion in these respects will not vary much with the period. $G_D/F$ and $f_D/F$ will then be sensibly constant and we have an approximate expression for the general law of dispersion, which agrees remarkably well with experiment.† If we now return to the equation of energies obtained from the elastic theory, we see at once that it does not suggest any such relation as experiment has indicated, either between the wave-velocity and the direction of displacement, or between the wave-velocity and the period.
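The contrast between equations (6) and (8) can be checked numerically. The following is only a minimal sketch: the function names and every numerical value are illustrative assumptions, not quantities taken from the paper, and the coefficients are treated as constants (that is, the type of the motion is assumed not to vary with the period). With the velocity in vacuo taken as unity, the period $p$ equals the wave-length in vacuo and $V^2$ equals the inverse square of the index of refraction, so that (8) is linear in the inverse square of the wave-length, while (6) makes the square of the index a constant diminished by a term proportional to $p^2$.

```python
# Sketch of equations (6) and (8). Names and numbers are illustrative
# assumptions; the coefficients are held constant with respect to the period.

def v2_elastic(p, A_D, B_ND, b_D):
    """Equation (6): V^2 = B_ND / (A_D - b_D * p^2)."""
    return B_ND / (A_D - b_D * p ** 2)


def v2_electric(p, F, G_D, f_D):
    """Equation (8): V^2 = G_D / F - f_D / (F * p^2)."""
    return G_D / F - f_D / (F * p ** 2)


if __name__ == "__main__":
    # Hypothetical constants, chosen only to show the shape of each law.
    F, G_D, f_D = 1.0, 0.5, 0.002
    A_D, B_ND, b_D = 2.0, 4.0, 0.5
    for p in (0.4, 0.5, 0.6, 0.7):
        # With unit velocity in vacuo, 1 / V^2 is the square of the index.
        n2_electric = 1.0 / v2_electric(p, F, G_D, f_D)
        n2_elastic = 1.0 / v2_elastic(p, A_D, B_ND, b_D)
        print(f"p={p:.1f}  n^2 (electric)={n2_electric:.5f}  "
              f"n^2 (elastic)={n2_elastic:.5f}")
```

Setting `f_D` and `b_D` to zero recovers the vacuum results (4) and (2), in which the velocity does not depend on the period at all.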
* But $G_D$, $f_D$, and $V^2$, considered as functions of the direction of displacement, are all subject to any law of symmetry which may belong to the structure of the body considered. The resulting optical characteristics of the different crystallographic systems are given on pages 192-194.

† This will appear most distinctly if we consider that $V$ divided by the velocity of light in vacuo gives the reciprocal of the index of refraction, and $p$ multiplied by the same quantity gives the wave-length in vacuo.

It remains to be seen whether it can be brought to agree with experiment by any hypothesis not too violent. In order that $V^2$ may be a quadratic function of any set of direction-cosines, it is necessary that $A_D$ and $b_D$ shall be independent of the direction of the displacement, in other words, in the case of a crystal like Iceland spar, that the direct action of the ponderable molecules upon the ether shall affect both the kinetic and the potential energy in the same way, whether the displacement take place in the direction of the optic axis or at right angles to it. This is contrary to everything which we should expect. If, nevertheless, we make this supposition, it remains to consider $B_{ND}$. This must be a quadratic function of a certain direction, which is almost certainly that of the displacement. If the medium is free from external stress (other than hydrostatic), $B_{ND}$, as we have seen, is symmetrical with respect to the wave-normal and the direction of displacement, and a quadratic function of the direction-cosines of each. The only single direction of which it can be a function is the common perpendicular to these two directions. If the wave-normal and the displacement are perpendicular, the direction-cosines
of the common perpendicular to both will be linear functions of the direction-cosines of each, and a quadratic function of the direction-cosines of the common perpendicular will be a quadratic function of the direction-cosines of each. We may thus reconcile the theory with the law of double refraction, in a certain sense, by supposing that $A_D$ and $b_D$ are independent of the direction of displacement, and that $B_{ND}$ and therefore $V^2$ is a quadratic function of the direction-cosines of the common perpendicular to the wave-normal and the displacement. But this supposition, besides its intrinsic improbability so far as $A_D$ and $b_D$ are concerned, involves a direction of the displacement which is certainly or almost certainly wrong. We are thus driven to suppose that the undisturbed medium is in a state of stress, which, moreover, is not a simple hydraulic stress. In this case, by attributing certain definite physical properties to the medium, we may make the function $B_{ND}$ become independent of the direction of the wave-normal, and reduce to a quadratic function of the direction-cosines of the displacement.* This entirely satisfies Fresnel’s Law, including the direction of displacement, if we can suppose $A_D$ and $b_D$ independent of the direction of displacement. But this supposition, in any case difficult for aeolotropic bodies, seems quite irreconcilable with that of a permanent (not hydrostatic) stress. For this stress can only be kept up by the action of the ponderable molecules, and by a sort of action which hinders the passage of the ether past the molecules. Now the phenomena of reflection and refraction would be very different from what they are, if the optical homogeneity of a crystal did not extend up very close to the surface. This implies that the stress is produced by the ponderable particles in a very thin lamina at the surface of the crystal, much less in thickness, it would seem probable, than a wave-length of yellow light.
* See note on page 224.

And this again implies that the power of the ponderable particles to pin down the ether, as it were, to a particular position is very great, and that the term in the energy relating to the motion of the ether relative to the ponderable particles is very important. This is the term containing the factor $b_D$, which it is difficult to suppose independent of the direction of displacement because the dimensions and arrangement of the particles are different in different directions. But our present hypothesis has brought in a new reason for supposing $b_D$ to depend on the direction of displacement, viz., on account of the stress of the medium. A general displacement of the medium midway between two nodal planes, when it is restrained at innumerable points by the ponderable particles, will produce special distortions due to these particles. The nature of these distortions is wholly determined by the direction of displacement, and it is hard to conceive of any reason why the energy of these distortions should not vary with the direction of displacement, like the energy of the general distortion of the wave-motion, which is partly determined by the displacement and partly by the wave-normal.* But the difficulties of the elastic theory do not end with the law of double refraction, although they are there more conspicuous on account of the definite and simple law by which they can be judged. It does not easily appear how the equation of energies can be made to give anything like the proper law of the dispersion of colors.
Since for given directions of the wave-normal and displacement, or in an isotropic body, $B_{ND}$ is constant, and also $A_D$ and $b_D$, except so far as the type of the vibration varies, the formula requires that the square of the index of refraction (which is inversely as $V^2$) should be equal to a constant diminished by a term proportional to the square of the period, except so far as this law is modified by a variation of the type of vibration. But experiment shows nothing like this law. Now the variation in the type of vibration is sometimes very important (it plays the leading rôle in the phenomena of selective absorption and abnormal dispersion), but this is certainly not always the case. It seems hardly possible to suppose that the type of vibration is always so variable as entirely to mask the law which is indicated by the formula when $A_D$ and $b_D$ (with $B_{ND}$) are regarded as constant. This is especially evident when we consider that the effect on the wave-velocity of a small variation in the type of vibration will be a small quantity of the second order.† The phenomena of dispersion, therefore, corroborate the conclusion which seemed to follow inevitably from the law of double refraction alone.

* The reader may perhaps ask how the above reasoning is to be reconciled with the fact that the law of double refraction has been so often deduced from the elastic theory. The troublesome terms are $b_D$ and the variable part of $A_D$, which express the direct action of the ponderable molecules on the ether. So far as the (quite limited) reading and recollection of the present writer extend, those who have sought to derive the law of double refraction from the theory of elastic solids have generally either neglected this direct action, a neglect to which Professor Stokes calls attention more than once in his celebrated “Report on Double Refraction” (Brit. Assoc., 1862, pp.
264, 268), or taking account of this action they have made shipwreck upon a law different from Fresnel’s and contradicted by experiment.

† See pages 190, 191 of this volume, or Lord Rayleigh’s Theory of Sound, vol. i, p. 84.

XV.

A COMPARISON OF THE ELECTRIC THEORY OF LIGHT AND SIR WILLIAM THOMSON’S THEORY OF A QUASI-LABILE ETHER.

[American Journal of Science, ser. 3, vol. xxxvii, pp. 139-144, February, 1889.]

A remarkable paper by Sir William Thomson, in the November number of the Philosophical Magazine, has opened a new vista in the possibilities of the theory of an elastic ether. Since the general theory of elasticity gives three waves characterized by different directions of displacement for a single wave-plane, while the phenomena of optics show but two, the first point in accommodating any theory to observation, is to get rid (absolutely or sensibly) of the third wave. For this end, it has been common to make the ether incompressible, or, as it is sometimes expressed, to make the velocity of the third wave infinite. The velocity of the wave of compression becomes in fact infinite as the compressibility vanishes. Of course it has not escaped the notice of physicists that we may also get rid of the third wave by making its velocity zero, as may be done by giving certain values to the constants which express the elastic properties of the medium, but such values have appeared impossible, as involving an unstable state of the medium.
The condition of incompressibility, absolute or approximate, has therefore appeared necessary.* This question of instability has now, however, been subjected to a more searching examination, with the result that the instability does not really exist “provided we either suppose the medium to extend all through boundless space, or give it a fixed containing vessel as its boundary.” This renders possible a very simple theory of light, which has been shown to give Fresnel’s laws for the intensities of reflected and refracted light and for double refraction, so far as concerns the phenomena which can be directly observed. The displacement in an aeolotropic medium is in the same plane passing through the wave-normal as was supposed by Fresnel, but its position in that plane is different, being perpendicular to the ray instead of to the wave-normal.* It is the object of this paper to compare this new theory with the electric theory of light.

* It was under this impression that the paper entitled “A Comparison of the Elastic and the Electric Theories of Light with respect to the Law of Double Refraction and the Dispersion of Colors,” [this volume pp. 223-231], was written. The conclusions of that paper, except so far as respects the dispersion of colors, will not apply to the new theory.
In the limiting cases, that is, when we regard the velocity of the missing wave in the elastic theory as zero, and in the electric theory as infinite, we shall find a remarkable correspondence between the two theories, the motions of monochromatic light within isotropic or aeolotropic media of any degree of transparency or opacity, and at the boundary between two such media, being represented by equations absolutely identical, except that the symbols which denote displacement in one theory denote force in the other, and vice versa.† In order to exhibit this correspondence completely and clearly, it is necessary that the fundamental principles of the two theories should be treated with the same generality, and, so far as possible, by the same method. The immediate consequences of the new theory will therefore be deduced with the same generality and essentially by the same method which has been used with reference to the electric theory in a former volume of this Journal [page 211 of this volume]. The elastic properties of the ether, according to the new theory, in its limiting case, may be very simply expressed by means of a vector operator, for which we shall use Maxwell’s designation. The curl of a vector is defined to be another vector so derived from the first that if $u$, $v$, $w$ be the rectangular components of the first, and $u'$, $v'$, $w'$ those of its curl,

$u' = \frac{dw}{dy} - \frac{dv}{dz}, \quad v' = \frac{du}{dz} - \frac{dw}{dx}, \quad w' = \frac{dv}{dx} - \frac{du}{dy}$, (1)

where $x$, $y$, $z$ are rectangular coordinates. With this understanding, if the displacement of the ether is represented by the vector $\mathfrak{E}$, the force exerted upon any element by the surrounding ether will be

$-B\,\mathrm{curl}\,\mathrm{curl}\,\mathfrak{E}\;dx\,dy\,dz$, (2)

where $B$ is a scalar (the so-called rigidity of the ether) having the same constant value throughout all space, whether ponderable matter is present or not. Where there is no ponderable matter, this force must be equated to the reaction of the inertia of the ether.
This gives, with omission of the common factor $dx\,dy\,dz$,

$A\,\ddot{\mathfrak{E}} = -B\,\mathrm{curl}\,\mathrm{curl}\,\mathfrak{E}$, (3)

where $A$ denotes the density of the ether.

* Sir William Thomson, loc. citat. R. T. Glazebrook, Phil. Mag., December, 1888.

† In giving us a new interpretation of the equations of the electric theory, the author of the new theory has in fact enriched the mathematical theory of physics with something which may be compared to the celebrated principle of duality in geometry.

The presence of ponderable matter disturbs the motions of the ether, and renders them too complicated for us to follow in detail. Nor is this necessary, for the quantities which occur in the equations of optics represent average values, taken over spaces large enough to smooth out the irregularities due to the ponderable particles, although very small as measured by a wave-length.* Now the general principles of harmonic motion† show that to maintain in any element of volume the motion represented by

$\mathfrak{E} = \mathfrak{A}\,e^{i\frac{2\pi}{p}t}$, (4)

$\mathfrak{A}$ being a complex vector constant, will require a force from outside represented by a complex linear vector function of $\ddot{\mathfrak{E}}$, that is, the three components of the force will be complex linear functions of the three components of $\ddot{\mathfrak{E}}$. We shall represent this force by

$B\,\Psi\,\ddot{\mathfrak{E}}\;dx\,dy\,dz$, (5)

where $\Psi$ represents a complex linear vector function.‡ If we now equate the force required to maintain the motion in any element to that exerted upon the element by the surrounding ether, we have the equation

$\Psi\,\ddot{\mathfrak{E}} = -\mathrm{curl}\,\mathrm{curl}\,\mathfrak{E}$, (6)

which expresses the general law for the motion of monochromatic light within any sensibly homogeneous medium, and may be regarded as implicitly including the conditions relating to the boundary of two such media, which are necessary for determining the intensities of reflected and refracted light.
* This is in no respect different from what is always tacitly understood in the theory of sound, where the displacements, velocities, densities considered are always such average values. But in the theory of light, it is desirable to have the fact clearly in mind on account of the two interpenetrating media (imponderable and ponderable), the laws of light not being in all respects the same as they would be for a single homogeneous medium.

† See Lord Rayleigh’s Theory of Sound, vol. i, chapters iv, v.

‡ It amounts essentially to the same thing, whether we regard the force as a linear vector function of $\mathfrak{E}$ or of $\ddot{\mathfrak{E}}$, since these differ only by the constant factor $-4\pi^2/p^2$. But there are some advantages in expressing the force as a function of $\ddot{\mathfrak{E}}$, because the greater part of the force, in the most important cases, is required to overcome the inertia of the ether, and is thus more immediately connected with $\ddot{\mathfrak{E}}$.

For let $u$, $v$, $w$ be the components of $\mathfrak{E}$; $u'$, $v'$, $w'$ those of curl $\mathfrak{E}$; $u''$, $v''$, $w''$ those of curl curl $\mathfrak{E}$, so that

$u' = \frac{dw}{dy} - \frac{dv}{dz}, \quad v' = \frac{du}{dz} - \frac{dw}{dx}, \quad w' = \frac{dv}{dx} - \frac{du}{dy}$,

$u'' = \frac{dw'}{dy} - \frac{dv'}{dz}, \quad v'' = \frac{du'}{dz} - \frac{dw'}{dx}, \quad w'' = \frac{dv'}{dx} - \frac{du'}{dy}$,

and let the interface be perpendicular to the axis of $Z$. It is evident that if $u'$ or $v'$ is discontinuous at the interface, the value of $u''$ or $v''$ becomes in a sense infinite, i.e., curl curl $\mathfrak{E}$, and therefore by (6) $\Psi\ddot{\mathfrak{E}}$, will be infinite. Now both $\ddot{\mathfrak{E}}$ and $\Psi$ are discontinuous at the interface, but infinite values for $\Psi\ddot{\mathfrak{E}}$ are not admissible. Therefore $u'$ and $v'$ are continuous. Again, if $u$ or $v$ is discontinuous, $u'$ or $v'$ will become infinite, and therefore $u''$ or $v''$. Therefore $u$ and $v$ are continuous. These conditions may be expressed in the most general manner by saying that the components of $\mathfrak{E}$ and curl $\mathfrak{E}$ parallel to the interface are continuous.
This gives four complex scalar conditions, or in all eight scalar conditions, for the motion at the interface, which are sufficient to determine the amplitude and phase of the two reflected and the two refracted rays in the most general case. It is easy, however, to deduce from these four complex conditions, two others, which are interesting and sometimes convenient. It is evident from the definitions of $w'$ and $w''$ that if $u$, $v$, $u'$, and $v'$ are continuous at the interface $w'$ and $w''$ will also be continuous. Now $-w''$ is equal to the component of $\Psi\ddot{\mathfrak{E}}$ normal to the interface. The following quantities are therefore continuous at the interface:

the components parallel to the interface of $\mathfrak{E}$,
the component normal to the interface of $\Psi\ddot{\mathfrak{E}}$,
all components of curl $\mathfrak{E}$. (7)

To compare these results with those derived from the electrical theory, we may take the general equation of monochromatic light on the electrical hypothesis from a paper in a former volume of this Journal. This equation, which with an unessential difference of notation may be written*

$-\mathrm{Pot}\,\ddot{\mathfrak{F}} - \nabla Q = 4\pi\,\Phi\,\mathfrak{F}$, (8)

was established by a method and considerations similar to those which have been used to establish equation (6), except that the ordinary law of electrodynamic induction had the place of the new law of elasticity.

* See page 218 of this volume, equation (12).

$\mathfrak{F}$ is a complex vector representing the electrical displacement as a harmonic function of the time; $\Phi$ is a complex linear vector operator, such that $4\pi\Phi\mathfrak{F}$ represents the electromotive force necessary to keep up the vibration $\mathfrak{F}$; $Q$ is a complex scalar representing the electrostatic potential, $\nabla Q$ the vector of which the three components are $\frac{dQ}{dx}$, $\frac{dQ}{dy}$, $\frac{dQ}{dz}$. Pot denotes the operation by which in the theory of gravitation the potential is calculated from the density of matter.† When it is
applied as here to a vector, the three components of the result are to be calculated separately from the three components of the operand.

† The symbol $-\mathrm{Pot}$ is therefore equivalent to $4\pi\nabla^{-2}$, as used by Sir William Thomson (with a happy economy of symbols) at the last meeting of the British Association to express the same law of electrodynamic induction, except that the symbol is here used as a vector operator. See Nature, vol. xxxviii, p. 571, sub init.

$-\nabla Q$ is therefore the electrostatic force, and $-\mathrm{Pot}\,\ddot{\mathfrak{F}}$ the electrodynamic force. In establishing the equation, it was not assumed that the electrical motions are solenoidal, or such as to satisfy the so-called “equation of continuity.” We may now, however, make this assumption, since it is the extreme case of the electric theory which we are to compare with the extreme case of the elastic. It results from the definitions of curl and $\nabla$ that curl $\nabla Q = 0$. We may therefore eliminate $Q$ from equation (8) by taking the curl. This gives

$-\mathrm{curl}\,\mathrm{Pot}\,\ddot{\mathfrak{F}} = 4\pi\,\mathrm{curl}\,\Phi\,\mathfrak{F}$. (9)

Since curl curl and $\frac{1}{4\pi}\mathrm{Pot}$ are inverse operators for solenoidal vectors, we may get rid of the symbol Pot by taking the curl again. We thus get

$-\ddot{\mathfrak{F}} = \mathrm{curl}\,\mathrm{curl}\,\Phi\,\mathfrak{F}$. (10)

The conditions for the motion at the boundary between different media are easily obtained from the following considerations. $\mathrm{Pot}\,\ddot{\mathfrak{F}}$ and $Q$ are evidently continuous at the interface. Therefore the components parallel to the interface of $\nabla Q$, and by (8) of $\Phi\mathfrak{F}$, will be continuous. Again, curl $\mathrm{Pot}\,\ddot{\mathfrak{F}}$ is continuous at the interface, as appears from the consideration that curl $\mathrm{Pot}\,\dot{\mathfrak{F}}$ is the magnetic force due to the electrical motions $\dot{\mathfrak{F}}$. Therefore, by (9), curl $\Phi\mathfrak{F}$ is continuous. The solenoidal condition requires that the component of $\mathfrak{F}$ normal to the interface shall be continuous.
The following quantities are therefore continuous at the interface : the components parallel to the interface of ) the component normal to the interface of $, l (11) all components of curl <§3*.J Of these conditions, the two relating to the normal components of $ and curl $$ are easily shown to result from the other four conditions, as in the analogous case in the elastic theory. If we now compare in tiie two theories the differential equations of the motion of monochromatic light for the interior of a sensibly homogeneous medium, (6) and (10), and the special conditions for the boundary between two such media as represented by the continuity of the quantities (7) and (11), we find that these equations and conditions become identical, if _ <|)-i (12) (13) (14)AND THE THEORY OF A QUASI-LABILE ETHER. 237 In other words, the displacements in either theory are subject to the same general and surface conditions as the forces required to maintain the vibrations in an element of volume in the other theory. To fix our ideas in regard to the signification of 'fr and §?, we may consider the case of isotropic media, in which these operators reduce to ordinary algebraic quantities, simple or complex. Now the curl of any vector necessarily satisfies the solenoidal condition (the so-called “ equation of continuity ”), therefore by (6) ''F® and @ will be solenoidal. So also will $ and in the electrical theory. Now for solenoidal vectors i d2 d2 d2 -curlcurl=^+3/+^’ (15) so that the equations (6) and (10) reduce to ^“(ap+a^+ap)8. (16) ¿y ( d2 d2 d2 \ ^ %~\dx*+dy2+dz*r%' (17) For a simple train of waves, the displacement, in be represented by a constant multiplied by either theory, may i(f7 t+ax+by+cz) c (18) Our equations then reduce again to #2^@=(a2+&2+c2)@, (19) ^ = (a2+62+c2)$^. (20) Hence cc2-b 62+c2‘ (21) The last member of this equation, when real, evidently expresses the square of the velocity of light. 
If we set n2 = k2 — = G’ r_ -, d(K) = d(i*y (33)

It is to be observed that if we should assume for a dispersion-formula

$$n^{-2} = a - b\lambda^{-2}, \tag{34}$$

$1/a$, which is the square of the index of refraction for an infinite wave-length, would be identical with the second member of (33).

Another similarity between the electrical and optical properties of bodies consists in the relation between conductivity and opacity. Bodies in which electrical fluxes are attended with absorption of energy absorb likewise the energy of the motions which constitute light. This is strikingly true of the metals. But the analogy does not stop here. To fix our ideas, let us consider the case of an isotropic body and circularly polarized light, which is geometrically the simplest case although its analytical expression is not so simple as that of plane-polarized light. The displacement at any point may be symbolized by the rotation of a point in a circle. The external force necessary to maintain the displacement $\mathfrak{F}$ is represented by $n^{-2}\mathfrak{F}$. In transparent bodies, for which $n^{-2}$ is a positive number, the force is radial and in the direction of the displacement, being principally employed in counterbalancing the dielectric elasticity, which tends to diminish the displacement. In a conductor $n^{-2}$ becomes complex, which indicates a component of the force in the direction of $\dot{\mathfrak{F}}$, that is, tangential to the circle. This is only the analytical expression of the fact above mentioned. But there is another optical peculiarity of metals, which has caused much remark, viz., that the real part of $n^2$ (and therefore of $n^{-2}$) is negative, i.e., the radial component of the force is directed towards the center. This inwardly directed force, which evidently opposes the electrodynamic induction of the irregular part of the motion, is small compared with the outward force which is found in transparent bodies, but increases rapidly as the period diminishes.
We may say, therefore, that metals exhibit a second optical peculiarity: that the dielectric elasticity is not prominent as in transparent bodies. This is like the electrical behavior of the metals, in which we do not observe any elastic resistance to the motion of electricity. We see, therefore, that the complex indices of metals, both in the real and the imaginary part of their inverse squares, exhibit properties corresponding to the electrical behavior of the metals.

The case is quite different in the elastic theory. Here the force from outside necessary to maintain in any element of volume the displacement $\mathfrak{E}$ is represented by $n^{2}\mathfrak{E}$. In transparent bodies, therefore, it is directed toward the center. In metals, there is a component in the direction of the motion $\dot{\mathfrak{E}}$, while the radial part of the force changes its direction and is often many times greater than the opposite force in transparent bodies. This indicates that in metals the displacement of the ether is resisted by a strong elastic force, quite enormous compared to anything of the kind in transparent bodies, where it indeed exists, but is so small that it has been neglected by most writers except when treating of dispersion. We can make these suppositions, but they do not correspond to anything which we know independently of optical experiment.

It is evident that the electrical theory of light has a serious rival, in a sense in which, perhaps, one did not exist before the publication of Sir William Thomson’s paper in November last.* Nevertheless, neither surprise at the results which have been achieved, nor admiration for that happy audacity of genius, which, seeking the solution of the problem precisely where no one else would have ventured to look for it, has turned half a century of defeat into victory, should blind us to the actual state of the question.

It may still be said for the electrical theory, that it is not obliged to invent hypotheses,†
but only to apply the laws furnished by the science of electricity, and that it is difficult to account for the coincidences between the electrical and optical properties of media, unless we regard the motions of light as electrical. But if the electrical character of light is conceded, the optical problem is very different from anything which existed in the time of Fresnel, Cauchy, and Green. The third wave, for example, is no longer something to be gotten rid of quocunque modo, but something which we must dispose of in accordance with the laws of electricity. This would seem to rule out the possibility of a relatively small velocity for the third wave.

* “Since the first publication of Cauchy’s work on the subject in 1830, and of Green’s in 1837, many attempts have been made by many workers to find a dynamical foundation for Fresnel’s laws of reflexion and refraction of light, but all hitherto ineffectually.” Sir William Thomson, loc. citat.

“So far as I am aware, the electric theory of Maxwell is the only one satisfying these conditions (of explaining at once Fresnel’s laws of double refraction in crystals and those governing the intensity of reflexion when light passes from one isotropic medium to another).” Lord Rayleigh, Phil. Mag., September, 1888.

† Electrical motions in air, since the recent experiments of Professor Hertz, seem to be no longer a matter of hypothesis. We can hardly suppose that the case is essentially different with the so-called vacuum. The theorem that the electrical motions of light are solenoidal, although it is convenient to assume it as a hypothesis and show that the results agree with experiment, need not occupy any such fundamental position in the theory. It is in fact only another way of saying that two of the constants of electrical science have a certain ratio (infinity).
It would be easy to commence without assuming this value, and to show in the course of the development of the subject that experiment requires it, not of course as an abstract proposition, but in the sense in which experiment can be said to require any values of any constants, that is, to a certain degree of approximation.

XVI. REVIEWS OF NEWCOMB AND MICHELSON’S “VELOCITY OF LIGHT IN AIR AND REFRACTING MEDIA” AND OF KETTELER’S “THEORETISCHE OPTIK.”

[American Journal of Science, ser. 3, vol. xxxi. pp. 62-67, Jan. 1886.]

Velocity of Light in Air and Refracting Media. Astronomical Papers prepared for the use of the American Ephemeris and Nautical Almanac, vol. II. parts 3 and 4,* Washington, 1885.

Professor Newcomb obtains as the final result of his experiments at Washington 299,860 ± 30 kilometers per second for the velocity of light in vacuo. Professor Michelson’s entirely independent experiments at Cleveland give substantially the same result (299,853 ± 60). His former experiments at the Naval Academy, after correction of two small errors which he now reports, give 299,910 ± 50. All these experiments were made with the revolving mirror, but the arrangements of the two experimenters were in other respects radically different. The first of these values of the velocity of light with Nyrén’s value of the constant of aberration (20".492) gives 149.60 for the distance of the sun in millions of kilometers.

On account of the recent announcement by Messrs. Young and Forbes of a difference of about two per cent. in the velocities of red and blue light, especial attention was paid to this point by both experimenters, without finding the least indication of any difference. In Professor Newcomb’s experiments, a difference of only one thousandth in these velocities would have produced a well-marked iridescence on the edges of the return image of the slit formed by reflection from the revolving mirror. No trace of such iridescence could ever be seen.
Professor Michelson made an experiment, in which a red glass covered one-half the slit. The two halves of the image (the upper white, the lower red) were exactly in line.

* [Part 3, “Measures of the Velocity of Light,” S. Newcomb; part 4, “Supplementary Measures of the Velocities of white and colored Light in air, water, and carbon disulphide,” A. A. Michelson.]

Since Maxwell’s electromagnetic theory of light makes the velocity of light in air equal to the ratio of the electromagnetic and electrostatic units of electricity, it will be interesting to compare some recent determinations of this ratio. These we give in the following table. Since the determinations are affected by any error in the standard of resistance, we have corrected the results, first, on the supposition that the B.A. ohm = .987 true ohms (Lord Rayleigh’s result), and secondly, on the supposition that the B.A. ohm = .989 true ohms, which is essentially assuming that the legal ohm represents the true value.

Ratio of Electromagnetic and Electrostatic units of Electricity in millions of meters and seconds.

                     Date.   As published.   B.A. ohm = .987.   B.A. ohm = .989.
Ayrton & Perry,*     1878    298.0           296.1              296.4
Hockin,†             1879    298.8           296.9              297.2
Shida,‡              1880    299.5           295.6              296.2
Exner,§              1882    301.1 (?)       291.7 (?)          292.3 (?)
J. J. Thomson,‖      1883    296.3           296.3              296.9
Klemenčič,¶          1884    301.88 (?)      301.88 (?)         302.48 (?)

These numbers are to be compared with the velocity of light in air, in millions of meters per second, for which Professor Newcomb gives 299.778.

Of the electrical determinations, that of J. J. Thomson appears by far the most worthy of confidence. That of Klemenčič, the only one as great as the velocity of light, was obtained by the use of a condenser with glass, a method which would presumably give too great a ratio. Exner’s value is obtained from the mean of three determinations, one of which differed from the others by about three per cent.
If we reject this discordant determination, the mean of the other two would give when corrected for resistance 294.4 and 295.0. If we set aside the determinations of Exner and Klemenčič, the remaining four, which represent three different methods, are very accordant, the mean being nearly identical with the result of J. J. Thomson, and about one per cent. less than the velocity of light.

* Phil. Mag., (5), vol. vii, p. 277.
† Report Brit. Assoc., 1879, p. 285.
‡ Phil. Mag., (5), vol. x, p. 431.
§ Sitzungsberichte Wien. Akad., vol. lxxxvi, p. 106.
‖ Phil. Trans., vol. clxxiv, p. 707.
¶ Sitzungsberichte Wien. Akad., vol. lxxxix, p. 298.

Professor Michelson’s experiments on the velocity of light in carbon disulphide afford an interesting illustration of the difference between the velocity of waves and the velocity of groups of waves, a subject which is treated at length in an appendix to the second volume of Lord Rayleigh’s Theory of Sound. If we write V for the velocity of waves, U for that of a group of waves, L for the wave-length, and T for the period of vibration,

$$V = \frac{L}{T}, \qquad U = \frac{d(T^{-1})}{d(L^{-1})}.$$

For purposes of numerical calculation, it will be convenient to transform these formulae by the use of $\lambda$ for the wave-length in vacuo, n for the index of refraction of the medium considered, and k for the velocity of light in vacuo, which we shall regard as constant, in accordance with general usage. By substitution of these letters we easily obtain

$$\frac{k}{V} = n, \qquad \frac{k}{U} = \frac{d(n\lambda^{-1})}{d(\lambda^{-1})}.$$

The data for the calculation of these quantities for carbon disulphide are given by Verdet (Annales de Chimie et de Physique, (3), vol. lxix, p. 470). They give

for the line D, k/V = 1.624, k/U = 1.722,
for the line E, k/V = 1.637, k/U = 1.767.

The quotient of the velocity in vacuo divided by the velocity in carbon disulphide, according to Professor Michelson’s experiments with the light of an arc lamp, is 1.76 ± .02, which agrees very well with k/U.
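The comparison just made can be checked with a few lines of arithmetic. The following is only an illustrative sketch: the numbers are those quoted above from Verdet and Michelson, while the function names are hypothetical. It uses the identity k/U = n − λ dn/dλ, which follows from the second formula by writing σ = λ⁻¹, so that d(nσ)/dσ = n + σ dn/dσ.

```python
# Figures quoted in the text for carbon disulphide (data of Verdet).
k_over_V = {"D": 1.624, "E": 1.637}   # wave index, k/V = n
k_over_U = {"D": 1.722, "E": 1.767}   # group index, k/U
observed, uncertainty = 1.76, 0.02    # Michelson, arc-lamp light


def implied_dispersion(n, group_index):
    """Return lambda * dn/dlambda, using k/U = n - lambda * dn/dlambda."""
    return n - group_index


def deviation(predicted):
    """Absolute difference between the observed quotient and a prediction."""
    return abs(observed - predicted)


for line in ("D", "E"):
    print(
        line,
        "deviation from k/V:", round(deviation(k_over_V[line]), 3),
        "deviation from k/U:", round(deviation(k_over_U[line]), 3),
        "lambda*dn/dlambda:", round(implied_dispersion(k_over_V[line], k_over_U[line]), 3),
    )
```

On these figures the observed quotient lies between the two values of k/U and well away from either value of k/V, which is the sense in which the experiment favors the group velocity.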
Another theory, which would make the velocity observed in such experiments $V^2/U$ (Nature, vol. xxv, p. 52), receives no countenance from these experiments. The value of $kU/V^2$ would be about 1.53.

Some may think that the experiments on water point in a different direction. Taking our data from Beer’s Einleitung in die höhere Optik, 1853, p. 411, we get

for D, k/V = 1.334, k/U = 1.352, $kU/V^2$ = 1.316,
for E, k/V = 1.336, k/U = 1.359, $kU/V^2$ = 1.313.

The number obtained by experiment was 1.330, which agrees better with k/V, or even with $kU/V^2$, than with k/U, but the differences are here too small to have much significance.

Theoretische Optik, gegründet auf das Bessel-Sellmeier’sche Princip, zugleich mit den experimentellen Belegen. Von Dr. E. Ketteler, Professor an der Universität in Bonn. Vieweg und Sohn. Braunschweig, 1885.

The principle of Sellmeier, here referred to, relates to vibrations of ponderable particles excited by the etherial vibrations of light, and to the reaction of the former upon the latter. The name of Bessel is added on account of his previous solution of a somewhat analogous problem relating to the pendulum. The object of this work is “to treat theoretical optics in a complete and uniform manner on the new foundation of the simultaneous vibration of etherial and ponderable particles, and to substitute a consistent and systematic new structure for the present conglomerate of more or less disconnected principles.”

Such a work demands a critical examination, which should not be undertaken from any narrow point of view. Any faults of detail will be readily forgiven, if the author shall give the theory of optics the ποῦ στῶ which it has sought so long in vain. We may add that if this effort shall not be judged successful by the scientific world, the author will at least have the satisfaction of being associated in his failure with many of the most distinguished names in mathematical physics.
We have sought to test the proposed theory with respect to that law of optics which seems most conspicuous in its definite mathematical form, and in the rigor of the experimental verifications to which it has been subjected, as well as in the magnificent developments to which it has given rise: the law of double refraction due to Huyghens and Fresnel, and geometrically illustrated by the wave-surface of the latter. We cannot find that the law of Fresnel is proved at all in this treatise. We find on the contrary, that a law is deduced which is different from Fresnel’s, and inconsistent with it.

We do not refer to anything relating to the direction of vibration of the rays in a crystal, which is a point not touched by the experimental verifications to which we have alluded. We shall confine our comparison to those equations from which the direction of vibration has been eliminated, and which therefore represent relations subject to experimental control. For this purpose equation (13) on page 299 is suitable. It reads

$$\frac{u^2}{n_x^2 - n^2} + \frac{v^2}{n_y^2 - n^2} + \frac{w^2}{n_z^2 - n^2} = 0,$$

$n_x$, $n_y$, $n_z$ being the principal indices of refraction. This the author calls the equation of the wave-surface or surface of ray-velocities. It has the form of the equation of Fresnel’s wave-surface, expressed in terms of the direction-cosines and reciprocal of the radius vector, and if u, v, w are the direction-cosines of the ray, and n the velocity of light in vacuo divided by the so-called ray-velocity in the crystal, the equation will express Fresnel’s law. But it is impossible to give these meanings to u, v, w, and n. They are introduced into the discussion in the expression for the vibrations (p. 295), viz.,

$$\rho = A \cos 2\pi\left(\frac{t}{T} - \frac{n(ux + vy + wz)}{\lambda}\right).$$

The form of this equation shows that u, v, w are proportional to the direction-cosines of the wave-normal, and as the relation $u^2 + v^2 + w^2 = 1$ is afterwards used, they must be the direction-cosines of the wave-normal.
They cannot possibly denote the direction-cosines of the ray, except in the particular case in which the ray and wave-normal coincide. Again, from the form of this equation, $\lambda/n$ must be the wave-length in the crystal, and if $\lambda$ here as elsewhere in the book (see p. 25) denotes the wave-length in vacuo of light of the period considered, which we doubt not is the intention of the author, n must be the wave-length in vacuo divided by the wave-length in the crystal, i.e., the velocity of light in vacuo divided by the wave-velocity in the crystal.

With these definitions of u, v, w, and n, equation (13) expresses a law which is different from Fresnel’s. Applied to the simple case of a uniaxial crystal, it makes the relation between the wave-velocity of the extraordinary ray and the angle of the wave-normal with the principal axis the same as that of the radius vector and the angle in an ellipse. The law of Huyghens and Fresnel makes the reciprocal of the wave-velocity stand in this relation.

The law which our author has deduced has come up again and again in the history of theoretical optics. Professor Stokes (Report of the British Assoc., 1862, part i, p. 269) and Lord Rayleigh (Phil. Mag., (4), vol. xli, p. 525) have both raised the question whether Huyghens and Fresnel might not have been wrong, and it might not be the wave-velocity and not its reciprocal which is represented by the radius vector in an ellipse. The difference is not very great, for if we lay off on the radii vectores of an ellipse distances inversely proportional to their lengths, the resultant figure will have an oval form approaching that of an ellipse when the eccentricity of the original ellipse is small. Rankine appears to have thought that the difference might be neglected (see Phil. Mag., (4), vol. i, pp. 444, 445); at least he claims that his theory leads to Fresnel’s law, while really it would give the same law which our author has found.
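How small the difference between the two laws is may be seen from a short numerical sketch. The principal indices used below are round figures of the order of those of calcite, taken only for illustration, and the function names are hypothetical; the code simply applies the ellipse relation once to the reciprocal of the wave-velocity (the law of Huyghens and Fresnel) and once to the wave-velocity itself (the other law).

```python
import math


def ellipse_radius(theta, along_axis, across_axis):
    """Radius vector of an ellipse at angle theta from the semi-axis `along_axis`."""
    return 1.0 / math.sqrt(
        (math.cos(theta) / along_axis) ** 2 + (math.sin(theta) / across_axis) ** 2
    )


def index_huyghens_fresnel(theta, n_o, n_e):
    """The reciprocal of the wave-velocity (proportional to n) follows the ellipse."""
    return ellipse_radius(theta, n_o, n_e)


def index_other_law(theta, n_o, n_e):
    """The wave-velocity itself (proportional to 1/n) follows the ellipse."""
    return 1.0 / ellipse_radius(theta, 1.0 / n_o, 1.0 / n_e)


# Illustrative principal indices (assumed values, roughly those of calcite).
n_o, n_e = 1.658, 1.486

for degrees in (0, 30, 45, 60, 90):
    theta = math.radians(degrees)
    print(
        degrees,
        round(index_huyghens_fresnel(theta, n_o, n_e), 4),
        round(index_other_law(theta, n_o, n_e), 4),
    )
```

The two laws agree along and across the principal axis and differ only at intermediate angles, which is why the question could be settled only by careful measurement.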
(Concerning Rankine’s “splendid failure,” and the whole history of the subject, see Sir Wm. Thomson’s Lectures on Molecular Dynamics at the Johns Hopkins University, chap. xx.) Professor Stokes undertook experiments to decide the question. His result, corroborated by Glazebrook (Proc. Roy. Soc., vol. xx, p. 443; Phil. Trans., vol. clxxi, p. 421), was that Huyghens and Fresnel were right and that the other law was wrong.

To return to our author; we have no doubt from the context that he regards u, v, w, and n as relating to the ray and not to the wave-normal. We suppose that that is the meaning of his remark that the expression for the vibrations (quoted above) is to be referred to the direction of the ray. It seems rather hard not to allow a writer the privilege of defining his own terms. Yet the reader will admit that when the vibrations have been expressed in the above form an inexorable necessity fixes the significance of the direction determined by u, v, w, and leaves nothing in that respect to the choice of the author.

The historical sketches of the development of ideas in the theory of optics, enriched by very numerous references, will be useful to the student. An exception, however, must be made with respect to the statements concerning the electromagnetic theory of light. We are told (p. 450) that the English theory, founded by Maxwell and represented by Glazebrook and Fitzgerald, makes the plane of polarization coincide with the plane of vibration, while Lorentz, on the basis of Helmholtz’s equations, comes to the conclusion that these planes are at right angles. Since all these writers make the electrical displacement perpendicular to the plane of polarization, we can only attribute this statement to some confusion between the electrical displacement and the magnetic force or “displacement” at right angles to it.
We are also told that Glazebrook’s “surface-conditions” which determine the intensity of reflected and refracted light are different from those of Lorentz, a singular error in view of the fact that Mr. Glazebrook (Proc. Camb. Phil. Soc., vol. iv, p. 166) expressly states that his results are the same as those of Lorentz, Fitzgerald, and J. J. Thomson. We have spent much fruitless labor in trying to discover where and how the expressions were obtained which are attributed to Glazebrook, but in which the notation has been altered. They ought to come from Glazebrook’s equations (24)-(27) (loc. cit.), but these appear identical with Lorentz’s equations (58)-(61) (Zeitschrift f. Math. u. Phys., vol. xxii, p. 27). They might be obtained by interchanging the expressions for vibrations in the plane of incidence and at right angles to it, with two changes of sign.

The reader must be especially cautioned concerning the statements and implications of what has not been done in the electromagnetic theory. These are such as to suggest the question whether the author has taken the trouble to read the titles of the papers which have been published. We refer especially to what is said on pages 248, 249 concerning absorption, dispersion, and the magnetic rotation of the plane of polarization.

In the Experimental Part, with which the treatise closes, we have a comparison of formulas with the results of experiments by the author and others. The author has been particularly successful in the formula for dispersion. In the case of quartz (p. 545), the formula (with four constants) represents the results of experiment in a manner entirely satisfactory through the entire range of wave-length from 2.14 to 0.214. Those who may not agree with the author’s theoretical views will nevertheless be glad to see the results of experiment brought together, and, so far as may be, represented by formulae.

XVII. ON THE VELOCITY OF LIGHT AS DETERMINED BY FOUCAULT’S REVOLVING MIRROR.

[Nature, vol. xxxiii.
p. 582, April 22, 1886.]

It has been shown by Lord Rayleigh and others that the velocity (U) with which a group of waves is propagated in any medium may be calculated by the formula

$$U = V - \lambda \frac{dV}{d\lambda},$$

where V is the wave-velocity, and $\lambda$ the wave-length. It has also been observed by Lord Rayleigh that the fronts of the waves reflected by the revolving mirror in Foucault’s experiment are inclined one to another, and in consequence must rotate with an angular velocity

$$\frac{dV}{d\lambda}\,\alpha,$$

where $\alpha$ is the angle between two successive wave-planes of similar phase. When $dV/d\lambda$ is positive (the usual case), the direction of rotation is such that the following wave-plane rotates towards the position of the preceding (see Nature, vol. xxv. p. 52). But I am not aware that attention has been called to the important fact, that while the individual wave rotates the wave-normal of the group remains unchanged, or, in other words, that if we fix our attention on a point moving with the group, therefore with the velocity U, the successive wave-planes, as they pass through that point, have all the same orientation. This follows immediately from the two formulae quoted above. For the interval of time between the arrival of two successive wave-planes of similar phase at the moving point is evidently $\lambda/(V - U)$, which reduces by the first formula to $d\lambda/dV$. In this time the second of the wave-planes, having the angular velocity $\alpha\, dV/d\lambda$, will rotate through an angle $\alpha$ towards the position of the first wave-plane. But $\alpha$ is the angle between the two planes. The second plane, therefore, in passing the moving point, will have exactly the same orientation which the first had. To get a picture of the phenomenon, we may imagine that we are able to see a few inches of the top of a moving carriage-wheel. The individual spokes rotate, while the group maintains a vertical direction.
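The argument may be followed numerically. The sketch below is illustrative only: the values of V, λ, dV/dλ and α are hypothetical, chosen merely to be of a plausible order of magnitude, and the function names are not taken from any source. It computes the interval between the arrival of successive wave-planes at a point moving with the group, and the angle through which a wave-plane turns in that interval.

```python
import math


def group_velocity(V, lam, dV_dlam):
    """Group velocity from the formula U = V - lambda * dV/dlambda."""
    return V - lam * dV_dlam


def rotation_between_arrivals(V, lam, dV_dlam, alpha):
    """Angle turned by a wave-plane between successive arrivals at the moving point."""
    U = group_velocity(V, lam, dV_dlam)
    interval = lam / (V - U)              # equals d(lambda)/dV by the first formula
    angular_velocity = alpha * dV_dlam    # rotation rate of an individual wave-plane
    return angular_velocity * interval


# Hypothetical values, for illustration only (metres, seconds, radians).
V = 1.8e8          # wave-velocity in the medium
lam = 5.9e-7       # wave-length
dV_dlam = 2.5e13   # slope of the wave-velocity with wave-length
alpha = 1.0e-6     # angle between successive wave-planes

angle = rotation_between_arrivals(V, lam, dV_dlam, alpha)
print(angle, alpha, math.isclose(angle, alpha, rel_tol=1e-9))
```

Whatever values are assumed (provided dV/dλ is not zero), the angle turned equals α, so each wave-plane reaches the moving point with the orientation of its predecessor.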
This consideration greatly simplifies the theory of Foucault’s experiment, and makes it evident, I think, that the results of all such experiments depend upon the value of U, and not upon that of V.

The discussion of the experiment by following a single wave, and taking account of its rotation, is a complicated process, and one in which it is very easy to leave out of account some of the elements of the problem. The principal objection to it, however, is its unreality. If the dispersion is considerable, no wave which leaves the revolving mirror will return to it. The individual disappears, only the group has permanence.

Prof. Schuster in his communication of March 11 (p. 430), has nevertheless obtained by this method, as the quantity determined by “the experiments hitherto performed,” $V^2/(2V - U)$, which, as he observes, is nearly equal to U. He would, I think, have obtained U precisely, if for the angle between two successive wave-planes of similar phase, instead of $2\omega\lambda/V$, he had used the more exact value $2\omega\lambda/U$.

By the kindness of Prof. Michelson, I am informed with respect to his recent experiments on the velocity of light in bisulphide of carbon that he would be inclined to place the maximum brilliancy of the light between the spectral lines D and E, but nearer to D. If we take the mean between D and E, we have

$$\frac{K}{U} = 1.745, \qquad \frac{K(2V - U)}{V^2} = 1.737,$$

K denoting the velocity in vacuo (see p. 249 of this volume). The number observed was 1.76, “with an uncertainty of two units in the second place of decimals.” This agrees best with the first formula. The same would be true if we used values nearer to the line D.

J. Willard Gibbs.

New Haven, Connecticut, April 1. [1886.]

XVIII. VELOCITY OF PROPAGATION OF ELECTROSTATIC FORCE.

[Nature, vol. liii. p. 509, April 2, 1896.]

As we may have to wait some time for the experimental solution of Lord Kelvin’s very instructive and suggestive problem concerning two pairs of spheres charged with electricity (see Nature of February 6, p.
316), it may be interesting to see what the solution would be from the standpoint of existing electrical theories.

In applying Maxwell’s theory to the problem it will be convenient to suppose the dimensions of both pairs of spheres very small in comparison with the unit of length, and the distance between the two pairs very great in comparison with the same unit. These conditions, which greatly simplify the equations which represent the phenomena, will hardly be regarded as affecting the essential nature of the question proposed.

Let us first consider what would happen on the discharge of (A, B), if the system (c, d) were absent. Let $m_0$ be the initial value of the moment of the charge of the system (A, B), (this term being used in a sense analogous to that in which we speak of the moment of a magnet), and m the value of the moment at any instant. If we set

$$m = F(t), \tag{1}$$

and suppose the discharge to commence when $t = 0$, and to be completed when $t = h$, we shall have

$$F(t) = m_0 \quad \text{when } t < 0, \tag{2}$$

and

$$F(t) = 0 \quad \text{when } t > h. \tag{3}$$

Let us set the origin of coordinates at the centre of the system (A, B), and the axis of x in the direction of the centre of the positively charged sphere. A unit vector in this direction we shall call i, and the vector from the origin to the point considered $\rho$. At any point outside of a sphere of unit radius about the origin, the electrical displacement ($\mathfrak{D}$) is given by the vector equation

$$4\pi\mathfrak{D} = \left[3r^{-5}F(t - cr) + 3cr^{-4}F'(t - cr) + c^2 r^{-3}F''(t - cr)\right] x\rho - \left[r^{-3}F(t - cr) + cr^{-2}F'(t - cr) + c^2 r^{-1}F''(t - cr)\right] i, \tag{4}$$

where F denotes the function determined by equation (1), $F'$ and $F''$ its derivatives, and c the ratio of electrostatic and electromagnetic units of electricity, or the reciprocal of the velocity of light.
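Equation (4) is easily evaluated for a particular discharge. The sketch below is only an illustration under stated assumptions: the form of F between t = 0 and t = h is not specified in the text, so a smooth quintic profile is assumed here (chosen so that F′ and F″ vanish at both ends), units are taken in which c = 1, and all function names are hypothetical. It shows the two limiting cases that follow from (2), (3), and (4): before the wave arrives the displacement is that of the undischarged system, and after the wave has passed it is zero.

```python
import math


def smooth_discharge(m0, h):
    """Return F, F', F'' for an ASSUMED profile: m0 for s < 0, zero for s > h,
    and a quintic step in between (so F' and F'' are continuous)."""
    def F(s):
        if s <= 0:
            return m0
        if s >= h:
            return 0.0
        u = s / h
        return m0 * (1 - (6 * u**5 - 15 * u**4 + 10 * u**3))

    def F1(s):
        if s <= 0 or s >= h:
            return 0.0
        u = s / h
        return -m0 * (30 * u**4 - 60 * u**3 + 30 * u**2) / h

    def F2(s):
        if s <= 0 or s >= h:
            return 0.0
        u = s / h
        return -m0 * (120 * u**3 - 180 * u**2 + 60 * u) / h**2

    return F, F1, F2


def four_pi_D(x, y, z, t, c, F, F1, F2):
    """Components of 4*pi*D from equation (4); valid for r > 1."""
    r = math.sqrt(x * x + y * y + z * z)
    s = t - c * r
    a = 3 * F(s) / r**5 + 3 * c * F1(s) / r**4 + c * c * F2(s) / r**3
    b = F(s) / r**3 + c * F1(s) / r**2 + c * c * F2(s) / r
    # a multiplies x*rho = (x*x, x*y, x*z); b multiplies the unit vector i.
    return (a * x * x - b, a * x * y, a * x * z)


F, F1, F2 = smooth_discharge(m0=1.0, h=1.0)
c = 1.0  # assumed units in which the velocity of light is unity

print(four_pi_D(2.0, 0.0, 0.0, 1.0, c, F, F1, F2))  # wave not yet arrived
print(four_pi_D(2.0, 0.0, 0.0, 2.5, c, F, F1, F2))  # wave passing
print(four_pi_D(2.0, 0.0, 0.0, 5.0, c, F, F1, F2))  # wave has passed
```

The first evaluation reproduces the static field of the moment $m_0$, and the last gives zero; only in the intermediate interval, of duration h at any fixed point, do the terms in F′ and F″ contribute.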
For this satisfies the general equation

$$-\nabla^2\mathfrak{D} = c^2\, d^2\mathfrak{D}/dt^2, \tag{5}$$

as well as the so-called “equation of continuity,” and also satisfies the special conditions that when $t < 0$

$$4\pi\mathfrak{D} = m_0\left(3r^{-5}x\rho - r^{-3}i\right)$$

outside of the unit sphere, and that at any time at the surface of this sphere

$$4\pi\mathfrak{D} = m\left(3x\rho - i\right),$$

if we consider the terms containing the factor c as negligible, when not compensated by large values of r.

That equation (4) satisfies the general conditions is easily verified, if we set

$$u = r^{-1}F(t - cr), \tag{6}$$

and observe that

$$-\nabla^2 u = c^2\, d^2u/dt^2, \tag{7}$$

and that the three components of $\mathfrak{D}$ are given by the equations

$$\left.\begin{aligned} 4\pi f &= -\,d^2u/dy^2 - d^2u/dz^2 \\ 4\pi g &= d^2u/dx\,dy \\ 4\pi h &= d^2u/dx\,dz \end{aligned}\right\} \tag{8}$$

Equation (4) shows that the changes of the electrical displacement are represented by three systems of spherical waves, of forms determined by the rapidity of the discharge of the system (A, B), which expand with the velocity of light with amplitudes diminishing as $r^{-3}$, $r^{-2}$, and $r^{-1}$, respectively. Outside of these waves, the electrical displacement is unchanged, inside of them it is zero.

If we write (with Maxwell) $-d\mathfrak{A}/dt$ for the force of electrodynamic induction at any point, and suppose its rectangular components calculated from those of $-d^2\mathfrak{D}/dt^2$ by the formula used in calculating the potential of a mass from its density, we shall have by Poisson’s theorem

$$\nabla^2(d\mathfrak{A}/dt) = 4\pi\, d^2\mathfrak{D}/dt^2,$$

or by (5),

$$\nabla^2(d\mathfrak{A}/dt) = -4\pi c^{-2}\nabla^2\mathfrak{D},$$

whence

$$d\mathfrak{A}/dt = -4\pi c^{-2}\mathfrak{D}. \tag{9}$$

From this, with (4), and the general equation