I've been called insane, but haven't been proven a liar...

Will the human race mostly abandon verbal communication in the near future?

It is unclear when humans invented a proper written language. Pictograms and various glyphs have been in use since possibly before the Bronze Age, but it is thought that a proper written language, a compact and convenient system of characters, corresponding sounds and a set of rules used to organize them into more complex structures, originated somewhere around 2800 years ago.

That's way after the first civilizations' emergence.

Prior to that, most civilizations communicated through systems of pictograms - which later evolved, supposedly to simplify expression on whatever writing media were in circulation. Most of these pictograms expressed entire syllables or even words, before the Greeks came up with a short, convenient alphabet with a relatively small number of symbols, one for every commonplace phoneme (which is usually made of one or two individual distinguishable sounds). Of course, a writing system made of symbols assigned to entire syllables or words is going to be freakin' huge, as is the case with modern Chinese characters or Japanese kanji - and therefore rather inconvenient to operate. So it's only natural that most human civilizations then switched from communicating through semi-pictograms such as cuneiform to using a compact alphabet.

Why would humans abandon the use of language?..

Because they are going in this direction now.

In nearly every reasonably popular messenger, there are at least several hundred emoticons available for use. Originally, they were a means to express a user's emotion; now there are pictograms for hundreds if not thousands of different objects.

The increasing usage of videos and pictures as means of communication

Ever since the appearance of "free" video hosting services (for which you still pay with your data, though), as well as cheap cameras of all sorts, there has been a slow transition from traditional, text-based media to video-based communications. Even though videos take far more bandwidth and are considerably harder to edit than a written article, an increasing number of people - both content creators and the audience - seem to prefer video blogs over "traditional" written articles.

Aside from videos, memes have gained considerable popularity as a communication medium. The use of pictures in media is nothing new, but thanks to the availability and ease of use of graphic editors of all kinds, as well as a large supply of "catchy" images, it has become fairly easy to communicate via these images. The presence of words in such memes is usually kept to a minimum, and the bulk of the concept is communicated via the graphic content, not the language, which serves more as support in these cases.

While it is nothing unusual or unexpected to see images and videos used alongside written language in media and communications, nobody expected humans to start reducing the role of words, and of language in general, once the Internet became capable of conveying more than just letters.

Platforms such as Twitter/X, which intentionally limit how much language their users can use, have existed for more than a decade already. People generally tend to communicate through short messages, rarely bothering to write more than two or three sentences at a time; this isn't so true for memes, however, as people are known to post dozens of them in quick succession.

As for the aforementioned videos, they allow one to communicate more than just raw words. Emotions, gestures, and subliminal signals are recorded by a camera just fine, and so it becomes possible to convey, with little effort, what would normally require humans to sit down for a minute and find the right words to express. In some cases, content creators (and others who communicate through videos) do not even bother to refine their speech, so from time to time one can see them stumbling during a narration, grunting a few "ughh"s or "aghh"s before they continue talking. And what's more interesting, even these grunts often mean something; it's not always the narrator struggling to think of what to say next - it's just that they try to express it through gestures and emotions before finding a suitable verbal expression for whatever they were going to say.

The use of pictograms in normal speech

So-called "emoticons", once just image versions of the smiley faces traditionally drawn in text form, now include hundreds of images representing all kinds of objects and entities. There are icons not just for hundreds of faces, but also for animals, trees, household items, cars and many, many more things; the usage of these pictograms in ordinary speech is rather small, but seems to be increasing.

Various messengers allow not only the use of those emoticons, but also "stickers", which are (often animated) images of all sorts. The number of such images increases daily, and so does their usage. Instead of describing a complex condition in words, one can simply enter a keyword or two in a search bar, find a suitable image, and attach it to the message. Or use it as the message.

At this point, those "emoticons" are essentially forming an alphabet of their own. There are images for hundreds of emotions, various people, objects, and even flags and symbols. Although they are far from replacing written language, their use in communications keeps growing.

As with videos, the usage of pictograms is a tool to augment one's communication via the first signal system, that is, facial expressions, gestures, and other signals originating from the times when the distant human ancestors were still animals. Right now, they're mostly used as a supporting structure; however, it is speculated that in some cases these pictograms might convey additional meaning, enough to avoid adding another sentence of explanation.

What could human communication look like after the partial phasing-out of language?

Language is part of the so-called "second signal system", which is symbolic communication: words, formulas, schemes etc. In contrast with the first signal system (the aforementioned facial expressions, gestures, images, intonation, etc.), the second signal system requires considerable conscious effort to use to its full potential. Since humans' sapience is very questionable and unconfirmed, it is only natural that they struggle with the second signal system - while not entirely unable to use it, their vocabulary is generally somewhat limited, they often stutter and struggle while talking, or even become aggressive/violent in the middle of a speech. (For example, it's not unusual to see a human stop in the middle of their speech, start making grunt-like noises for some time, and then run some sort of fall-back line like "that's just how things are". Sometimes they might become violent, which might range from a simple "hey, fuck you" to a full-blown brawl.)

Therefore, it is more convenient for human beings to explain themselves using imagery and videos.

As time goes on, it is highly likely that the use of language will be "phased out" and heavily reduced (though unlikely that it'll be eliminated entirely); most words will likely be shortened to one syllable, rarely two - this can already be observed with some longer words in mainstream use. Written language will become more of a support tool, with most of the communication happening in pictures and memes. With the emergence of generative models such as GANs, it is possible there will be some sort of AI designed to insert emoticons/stickers in place of written words for what it considers to be "visual gratification".

It could look like this. Notice how most of the actual written words are four or fewer characters long.

Using audio (and sometimes short video clips) is an option in many messengers, and since its emergence, an increasing number of messages have been conveyed in the form of audio/video clips instead of typed text. At some point, the use of audio/video recordings might expand into nearly all areas of communication - including legal acts and contracts. It might be possible to strike a deal via an audio/video recording only. As for passwords and security in general, there are already non-symbolic keys and even biometric security systems installed in many modern devices.

Individual words will likely be reduced in length for convenience. We already have some instances of this (for example, "infosec" means "information security", and "vegan" was coined by clipping "vegetarian"), but in the future, most of the words in common use might get shortened down to two to five letters, and multi-word phrases might be fused into one short word, or even into an acronym.

The number of words in common use will likely shrink, too. Long words and phrases will be eliminated from mainstream communications because they are inconvenient to use, and so will synonyms; in order to reduce the mental burden on commoners, the use of words will be mostly replaced with more understandable pictures, videos, sounds and memes, with which they'll be adequately able to express whatever limited concepts they have. With the increasing use of machines in various industries and enterprises, and the ever-increasing burden laid on them, it'll become possible to simplify learning courses, too; most human professions and occupations will be adequately taught via picture books and possibly simplified schemes with limited usage of actual language.

With the use of the aforementioned GANs, it is possible that text messages might be phased out altogether. Instead, the message input will be processed by a generative model, not unlike how modern "AI artists" generate a picture from a combination of words used as an instruction; the model will analyze the text/pictogram input and convert it into a short animation roughly representing the message sent. The sender will submit a text/pictogram message, while the recipient will get a short animation clip.

As for spoken language, it most likely won't vanish - well, we do need the means for sonic representation of those pictures and animations - however, it'll most likely undergo the same change as written language: a reduction in complexity. Most mainstream humans will communicate in short words, one or two syllables long; these words will be intermingled with all kinds of non-verbal sounds, which will serve the purpose of conveying context via the first signal system. Characteristics of speech such as tone, speed, cadence etc. will likely become more significant than the words themselves.

What about science? High-level research? What about math, which is entirely made of symbols, schemes, and graphs of all sorts?

While pictography may (and likely will) affect research and science, language and schemes will most likely remain in use in these fields, largely unchanged.

Unlike normal conversations and publications, most scientific efforts require extra precision. In "generic" conversations, one can express thoughts and opinions freely; in research, an error, uncertainty or misinterpretation at any stage may result in an incorrect conclusion, which, when used in further research or any other thinking process, is very likely to lead to derivative incorrect conclusions, and so on. This easily leads to a flawed perceptive model of the state of affairs in question.

Referring to the last image above this line, it is kind of hard to say exactly what the sender meant by their message. Maybe the recipient did actually decipher this message, and maybe not. But from an outsider's point of view, it might still look confusing. (Even though I've got the general idea of that message, I still can't say precisely what each and every emoticon means there.)

Unlike pictograms and other first-order signals, words express concepts much more precisely. For example, when you want to say, "a potentially-habitable shallow-ocean moon orbiting an ice giant around an A-class main-sequence star", you just can't replace any of the words with the moon/sun emoticon and preserve the original meaning. Sure, a commoner may understand this as "an alien moon which you can possibly live on!", and there is a way to express that in a word-pictogram mix, but this will not be nearly enough for a scientist.

In other words, once we need precision, pictograms and memes are just not going to do it.

So, for research and scientific purposes, as well as wherever a sufficiently detailed or precise expression is needed, more or less ordinary language will continue to be used.

Conclusion

Based on whatever experience/observations of human communications I have, language is, indeed, a cumbersome and inconvenient tool for the masses.

Humans in their observed majority are not good at handling language: they show a limited vocabulary, numerous errors in spelling and punctuation (sometimes critical for context, such as confusing "your" and "you're"), and an inability to explain or express a state of affairs in words quickly enough in a conversation, at which point they're forced to resort to so-called "escape sequences".

They do not, however, encounter the same issues while communicating through other media: videos, memes, pictures, pictograms, etc. And since modern technologies allow for efficient communication through non-verbal or semi-verbal means - or at least present a visible opportunity to implement that - there is no reason to think that language will continue as the main means of communication in mainstream societies. Especially when most members of said mainstream societies strive for maximum comfort, sometimes even at the expense of their own liberties or security.

Language will remain, but as a means of precise or detailed expression. From the mainstream, it will likely mostly disappear.