https://www.theregister.com/2024/10/02/china_telecom_model_trained_local_tech/
China trains 100-billion-parameter AI model on home grown infrastructure
Research institute seems to have used Huawei kit to do it – perhaps with Arm cores – despite sanctions
Wed 2 Oct 2024 // 02:05 UTC

China Telecom's AI Research Institute claims it trained a 100-billion-parameter model using only domestically produced computing power – a feat that suggests Middle Kingdom entities aren't colossally perturbed by sanctions that stifle exports of Western tech to the country.
The model is called TeleChat2-115B and, according to a GitHub update posted on September 20, was "trained entirely with domestic computing power and open sourced."
"The open source TeleChat2-115B model is trained using 10 trillion tokens of high-quality Chinese and English corpus," the project's GitHub page states.
The page also contains a hint about how China Telecom may have trained the model, in a mention of compatibility with the "Ascend Atlas 800T A2 training server" – a Huawei product listed as supporting the Kunpeng 920 7265 or Kunpeng 920 5250 processors, respectively running 64 cores at 3.0GHz and 48 cores at 2.6GHz.
Huawei builds those processors on the Armv8.2 architecture and bills them as produced with a 7nm process.
At 100 billion parameters, TeleChat2 trails the likes of recent Llama models that apparently top 400 billion parameters, or OpenAI's o1, which has been guesstimated at around 200 billion parameters. While parameter count alone doesn't determine a model's power or utility, the low-ish parameter count suggests training TeleChat2 would likely have required less computing power than was needed for other projects.
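To put rough numbers on that, a common rule of thumb estimates training compute at about six FLOPs per parameter per training token. On that approximation – ours, not a figure from China Telecom – TeleChat2's 115 billion parameters and 10 trillion tokens work out to roughly 7×10²⁴ FLOPs, several times less than a 400-billion-parameter model trained on a larger corpus would need.

# Back-of-envelope training compute using the rough rule of thumb
# FLOPs ≈ 6 × parameters × training tokens. Our approximation, not a
# figure disclosed by China Telecom.
params = 115e9   # TeleChat2-115B parameter count
tokens = 10e12   # 10 trillion training tokens, per the GitHub page

print(f"TeleChat2: ~{6 * params * tokens:.1e} FLOPs")   # ~6.9e+24
# For comparison, a 405B-parameter model on a 15T-token corpus (roughly
# the published Llama 3.1 405B scale) lands about 5x higher:
print(f"405B/15T:  ~{6 * 405e9 * 15e12:.1e} FLOPs")     # ~3.6e+25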
Which may be why we can't find a mention of a GPU – although the Ascend training server has a very modest one to drive a display at 1920 × 1080 at 60Hz with 16 million colors.
It therefore looks like the infrastructure used to train this model was not at parity with the kind of rigs available outside China, suggesting that tech export sanctions aren't preventing the Middle Kingdom from pursuing its AI ambitions.
Or that it can deliver in other ways, such as China Telecom's enormous scale. The carrier has revenue of over $70 billion, drawn from its provision of over half a billion wired and wireless subscriptions. It's also one of the biggest users and promoters of OpenStack. Even without access to the latest and greatest AI hardware, China Telecom can muster plenty of power. ®