Music is a Super-Stimulus for the Perception of Psychological Distance in Proto-Music
Proto-Music
Proto-music is a term generally used to refer to any hypothetical ancestor of music that is considered sufficiently different from music as we know it to deserve its own term.
In this article I propose a specific hypothesis about the ancestor of modern human music, so I use the term “proto-music” to refer to the ancestor of modern music as proposed within my hypothesis.
My main hypothesis is that music, as we know it, is a super-stimulus for the perception of psychological distance in proto-music.
But to develop this hypothesis, first I have to explain how proto-music functioned as a system of communication, and how the perception of psychological distance evolved as a component of the perception of proto-music by its listeners.
Proto-Music was a system of communicating Shared Emotion
In its original form, proto-music was a form of communication. It was used by our hominid ancestors to communicate emotion, within a group, about something - the referent. The proto-music itself did not contain any information about what the referent was - it only expressed the actual emotion. It expressed a shared emotion, in effect saying to all those listening: “There is something which is important to all of us in the group, and this is the emotion that we should all be feeling about it.”
In its earliest usage, proto-music communicated shared emotion about something in the immediate “here-and-now”, and that something would normally be something of very immediate concern to all those in the group. In other words, it was obvious what the referent was.
This form of communication existed in order to support the lifestyle of our ancestors.
Our ancestors had a lifestyle which involved situations where all members of a group had to act in concert in real-time situations involving high risk and high reward, and it was important that all members of the group agreed as much as possible at all times about their evaluation of the current situation.
More specifically, our ancestors had a lifestyle that included confrontational scavenging[1], a lifestyle that existed about 2 million years ago.
Prior to confrontational scavenging there was what we might call “non-confrontational scavenging”.
Non-confrontational scavenging was when there was a carcass of a dead animal, and all the large dangerous scavengers had already eaten all that they could eat (or could be bothered to eat), and then our hominid ancestors came in with their stone tools and used them to cut or break the bones and extract the marrow from inside.
Confrontational scavenging was when there was a carcass of a dead animal, and the large dangerous scavengers hadn’t yet finished, but our ancestors decided to go in anyway. The large dangerous scavengers were larger and stronger than our hominid ancestors, and the only way our ancestors could hope to take them on or scare them away would be to act as a group in a very coordinated fashion.
At any moment in a confrontational scavenging situation, all members of the group would have to be in agreement about whether it was a “go” or a “no-go”. Also, a “go” situation might start off OK, but then things could turn bad, and at some point it would become a “let’s get out of here” situation.
The important thing is that all members of the group would have to be in agreement about their evaluation of the confrontational situation at all times. It would never be good for half of the group to be thinking that it was worth continuing and the other half thinking it was time to leave.
Confrontational scavenging and non-confrontational scavenging are distinct food-gathering strategies, but at the same time we can identify them as points on a continuum. This continuum would have facilitated the evolution from non-confrontational to confrontational scavenging. That is, even for hominids following a non-confrontational strategy, there would occasionally arise situations that might be moderately confrontational but not intensely so - perhaps a largish, slightly dangerous scavenger needed to be scared away. Our hominid ancestors evolved or learned strategies to deal with these moderately confrontational situations, and these strategies then evolved incrementally over time to handle situations involving confrontation with larger and more dangerous scavengers.
Out of the “Here-and-Now”
Proto-music originally evolved as a system for communicating shared emotion in dangerous real-time scenarios that required highly coordinated group action.
But once it existed, it could apply to other situations.
However, because the referent was always implicitly something in the “here-and-now”, there was no way to use proto-music to talk about things beyond that here-and-now.
Examples of things that our ancestors might have wanted to talk about beyond the here-and-now would have included:
Situations in the past
Situations in the future
Situations in locations some distance from the present location
Situations that were purely hypothetical - they might have happened, or maybe they would happen, or maybe not at all.
We can use the term psychological distance to refer to the general quality of not being in the “here-and-now”. That is, in the here-and-now the psychological distance is zero; for things that just happened, or are just about to happen, or are happening close by, there is a small psychological distance; and for things in the distant past or distant future, or in a faraway place, there is a large psychological distance.
If the psychological distance is large enough, then the situation it applies to ceases to have any direct relevance to any actual decision that a group or individual needs to make. In other words it’s too far away to matter, or it’s too far in the past to matter, or it’s too far in the future to matter.
Any very large psychological distance is practically the same as any other very large psychological distance, in the sense that any situation it applies to has no practical relevance to any actual decision that needs to be made. Such a situation becomes indistinguishable from a purely hypothetical one, where one thinks about the situation without putting it into any pragmatically relevant time or place.
For example, I could tell a story about something that happened many generations ago, or in another world far, far away, or that happened in a different world that exists in another dimension - in the end it’s just a story that has no immediate relevance to any decision that anyone in the audience needs to make about anything in their everyday life. But at the same time, it may still be beneficial for us to spend time thinking about such hypothetical situations, because it increases the scope of things that we have practiced thinking about.
But how could our ancestors use proto-music to refer to something outside of the immediate here-and-now?
The major problem to overcome was that proto-music lacked any means to specify the referent so that those listening to the speaker could determine what that referent might be, if it wasn’t something in the here-and-now.
The first stage in the evolution of referring to things outside of the here-and-now was probably to refer to things just slightly removed from the here-and-now: for example, in the recent past, or in the near future, or somewhere a small distance away from the present location.
But even for a referent just slightly removed from the here-and-now, it would still be necessary for listeners to know that the referent wasn’t actually in the immediate here-and-now. The speaker would have to provide additional information that would give the listeners clues about what it was that the speaker was referring to.
Various means might have evolved for proto-music to include such additional information.
One of those means would have been the invention of words. That is, words first appeared as an enhancement to proto-music. In a situation where the referent of proto-music might be more than one thing, the speaker could use words to pinpoint which particular thing the referent was.
Once some means existed to be more specific about the referent, the possibility existed for speakers to speak proto-musical utterances where the referent wasn’t in the here-and-now, on the assumption that the listeners would be able to determine, or at least guess, with the help of the additional information, what the referent actually was.
Additional information provided about the referent would also allow listeners to determine how much psychological distance that referent had.
However, it is also possible that proto-music directly expressed the level of psychological distance.
This could happen if the level of psychological distance in the intended referent somehow modulated the expression of proto-music by the speaker. That is, the intention to express emotion about something involving psychological distance caused a general alteration in the brain state of the speaker, and this alteration had an effect on the generation of proto-music, and this effect was perceptible to the listeners.
Such modulation did not necessarily evolve - rather it might have been an incidental pre-existing side-effect of how psychological distance altered brain state.
Initially, listeners could learn the relationship between the modulation they perceived (caused by the psychological distance represented in the speaker’s brain state) and the actual temporal or spatial distance, provided that the speaker supplied additional information to identify the referent.
For example, suppose something happens, and a short time later a speaker utters proto-music including words which identify that particular event. The listeners will know the length of the time interval between the event and the speaker’s utterance, and they will also perceive the modulation of the utterance due to the corresponding psychological distance. They will then be able to correlate those two things, i.e. the perceived psychological distance and the actual time interval.
Similarly, if a speaker refers to some situation which is in some other location and sufficient additional information is provided allowing listeners to infer the actual location and therefore the actual physical distance involved, then those listeners could learn to correlate the expressed psychological distance and the actual spatial distance.
Initially the relationship between the perceived modulation and actual distance in time or space would have been learned, but over time it could have evolved to become instinctive knowledge. (On the one hand everything is potentially learnable; on the other hand learning takes time and effort, so if something can be pre-learned, i.e. as an evolved instinct, that can save time and effort, in effect allowing the individual to “hit the ground running”. For some things it matters more to have the flexibility of a learning process, and for other things it’s more important for your brain to know straight away how to process the relevant information.)
It is thus plausible that proto-music, by some means, expressed the level of psychological distance, and that listeners had the ability to perceive this expression of psychological distance.
But, returning to my main hypothesis as stated above, why would it be beneficial to generate a super-stimulus for such a perception?
Proto-Music gave birth to words, and then the child replaced its parent
Originally words evolved as an enhancement to proto-music. Proto-music expressed shared emotions, but it did not directly specify what the emotions were about. Words enabled speakers to provide additional information, enabling listeners to determine what an asserted emotion was about, in situations where there might be more than one possible referent. Words even made it possible for speakers to specify a referent which was not in the here-and-now.
On repetition, and efficiency
If proto-music was a system of asserting shared emotion among members of a group, it would be logical for the system to include some means of confirming the validity of this assertion. That is, a speaker’s proto-musical utterance says, “I think we should all feel this emotion about the situation”, but then the question naturally arises - does everyone listening actually agree with that asserted emotion?
One possible way to provide such confirmation would be for the listeners to repeat the same proto-musical utterance expressing the relevant emotion. Indeed the speaker could initiate the repetition of their own proto-musical utterance, and listeners would be able to “join in” to the repetition, to confirm their own agreement with the asserted emotion.
Because proto-music was often relevant to real-time situations, proto-musical utterances had to be fairly short, compared, for example, to what we (as modern humans) would consider to be a single melody that defines a modern musical item.
These utterances had to be as short as possible, because they were often uttered in situations where time was of the essence.
One aspect of even the simplest musical items is that of progression. Progression typically consists of a repetition of a single musical phrase, such that in some ways the repeated musical phrase is the same, but in at least one way it is different (for example, it might be positioned one or two notes higher on the musical scale). Typically a progression might consist of an initial musical phrase, followed by one, two or three variations, then followed by the initial phrase again, and so on.
What we consider music (in the modern world) is not directly relevant to real-time decision making, so musical items can be as long as they need to be. Proto-musical items needed to be shorter, because they were about things happening in real-time in the here-and-now.
So it is plausible that progression as we know it in modern human music was not a feature of the original form of proto-music.
In that case, a proto-musical utterance would have consisted of just a single proto-musical phrase, repeated until it was determined whether or not all members of the group agreed with the emotion thereby expressed.
However, even though proto-musical “tunes” might have been shorter than modern musical tunes, proto-music was still a much less efficient form of language than modern human word-based language. With word-based language, the unit of meaning is usually a word, or sometimes part of a word, and is typically only one or two syllables long. For proto-music, by contrast, the unit of meaning was the full musical item, i.e. the approximate equivalent of a musical phrase in a modern musical item, which in most cases is much more than two syllables long (usually more like six or eight syllables at the very least). Also, in proto-music, even if the melodies were fairly short, each melody still had to be repeated, which of course increased the amount of time required to fully communicate that information. (Admittedly, the repetition included the act of listeners confirming their agreement with the expressed emotion, which counts as communication of additional information. But it was not a large amount of extra information: only one “yes” or “no” bit per listener.)
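The comparison above can be made concrete with a rough back-of-the-envelope sketch. The syllable counts (two per word, six per proto-musical phrase) are taken from the estimates in the text. The repetition count of three and the group size of ten are purely hypothetical assumptions chosen for illustration, as are all of the names used in the code.

```python
from dataclasses import dataclass


@dataclass
class Channel:
    """A toy model of a communication channel (illustrative only)."""
    name: str
    syllables_per_unit: int  # syllables in one unit of meaning
    repetitions: int         # times the unit is uttered before it is "communicated"

    def syllables_per_meaning(self) -> int:
        # Total syllables spent to get one unit of meaning across.
        return self.syllables_per_unit * self.repetitions


def relative_cost(a: Channel, b: Channel) -> float:
    """How many times more syllables channel `a` needs than channel `b`."""
    return a.syllables_per_meaning() / b.syllables_per_meaning()


def confirmation_bits(listeners: int) -> int:
    """Extra information gained from repetition: one yes/no bit per listener."""
    return listeners


# Syllable counts follow the text; repetitions=3 is a hypothetical assumption.
words = Channel("word-based language", syllables_per_unit=2, repetitions=1)
proto = Channel("proto-music", syllables_per_unit=6, repetitions=3)

# Group size of 10 is also a hypothetical assumption.
group_size = 10

print(f"{proto.name}: {proto.syllables_per_meaning()} syllables per unit of meaning")
print(f"{words.name}: {words.syllables_per_meaning()} syllables per unit of meaning")
print(f"relative cost: {relative_cost(proto, words):.1f}x")
print(f"confirmation gained: {confirmation_bits(group_size)} bits")
```

Under these assumptions proto-music spends nine times as many syllables per unit of meaning, and the only thing gained in return is a handful of confirmation bits. The exact ratio depends entirely on the assumed repetition count, so it should be read as an illustration of the argument and not as an estimate.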
As well as being less efficient than word-based language, proto-music was also more limited, because it could only express certain kinds of meanings, that is, those involving shared emotions. Modern word-based language can be used to say things that relate to shared emotions, but it can also express many kinds of meaning that don’t involve shared emotions at all.
So it was likely, as the word-based component of language grew more sophisticated, that the proto-musical component of language became obsolete, because of its inefficiency, and because it was constrained to only say things that involved shared emotion.
As a result, proto-music eventually ceased to exist as a form of communication, and words took over completely.
How Proto-Music “pivoted” to a new function …
Proto-music was responsible for the birth of word-based language, but in the end the words became so powerful that the proto-music just got in the way, and it became obsolete, and as a form of communication it faded away.
But, something else happened. Proto-music as a form of communication went away, but a special type of proto-music continued to exist. And that special type of proto-music is what we now know as “music”.
But how did this special form of proto-music come into existence, and what function did it perform, if proto-music itself no longer had a useful function as a form of communication?
The critical defining feature of music, according to my hypothesis, is that it expresses maximal psychological distance.
As I have already explained, proto-music evolved into a form where it could express greater or lesser degrees of psychological distance, in relation to the referent of the shared emotion being expressed.
Music evolved as a form of proto-music that expressed the maximum possible degree of psychological distance. It expressed this by exaggerating to the maximum extent possible those modulations that occurred when a speaker of proto-music expressed psychological distance.
This resulted in listeners experiencing an intrinsic motivation to think about things with maximum possible psychological distance.
That is, it encouraged listeners to think about things as far removed as possible from the immediate here-and-now. It encouraged listeners to think about ideas and situations so far removed from their normal experience that those thoughts were just crazy fantasies.
The musical form of proto-music was no longer a means of communication. It was instead a form of mind alteration, one which altered the minds of listeners to think thoughts that they would not otherwise think.
This type of mind alteration probably played a fundamental role in the evolution of human thinking and human culture.
The Final Phase
However, this pivot from communication to mind alteration was not the last thing to happen in the evolutionary history of proto-music and music.
There was one last phase in the evolution of music: even this new function of music, the function of mind alteration to motivate thoughts with maximum psychological distance from the here-and-now, became obsolete.
Although music can still alter our minds in ways that encourage us to have thoughts that we might not otherwise think, music no longer exists as a primary motivation for thought processes involving things removed from the here-and-now.
The main reason for this obsolescence is that human society now provides other means for learning to think thoughts about things beyond the immediate here-and-now.
Most of those means are based on word-based language.
For example, most of the time it is not necessary for any individual to think new original crazy thoughts about things beyond the mundane here-and-now, because most of the time someone else has already had those thoughts.
When you are part of a large society, or even connected loosely to a large society, there is no need to do all the thinking, because you already have access to many of the thoughts that have already been thought of by other people.
Of course not everyone shares every interesting thought that they have. But any time a person has an interesting thought and shares it, i.e. by stating it in words, with other people who find it interesting, it will eventually spread to all of the society that the person lives in, and eventually to all other societies that are even slightly connected to that society.
An increase in the ability to share interesting ideas across large societies implied a decrease in the benefit derived from any individual making an effort to think of such ideas for themselves, and this implied a decrease in the benefit derived from any motivation to engage in such thinking.
But if that was the case, how come we are still creating and listening to music?
That is, if music once had a function as a form of mind alteration, and if this function is itself no longer important in the large societies that we now live in, why is it that music continues to even exist as a thing? Why hasn’t it just evolved away?
One possible explanation is that modern large human societies as we know them have not existed for that long. So-called “civilization” is not much older than 12,000 years, with perhaps the earlier forms of more organized societies dating back to 20,000 years ago at the most.
So the conditions that caused the mind-altering functionality of music to become obsolete have not been around that long, and music hasn’t yet evolved away because evolution is something that takes time. Music is in the process of evolving away, but that process is not yet complete.
(It is noteworthy that a significant minority of people - about 4% - are uninterested in music[2], and this may reflect the beginning of the eventual full disappearance of music as a human trait.)