Alibaba Group Holding has introduced a computing pooling solution that it said led to an 82% cut in the number of Nvidia graphics processing units (GPUs) needed to serve its artificial intelligence models. Drastically reducing the number of GPUs needed to run AI models, as the pooling system devised by Alibaba researchers does, could have big consequences for the scale of huge data centers, while benefiting smaller organizations. The Chinese cloud champ developed GPU pooling and memory management tech that lets it run more models on each GPU and offload data into a host’s memory or other storage.
Alibaba Cloud has introduced a GPU pooling technology that significantly cuts the number of Nvidia H20 units needed.
The new system reportedly reduces the use of these GPUs by 82% while managing multiple large language models with up to 72 billion parameters.