
Regarding performance: I've been looking into this for 1.5 years now, ever since I got a Cubietruck: !topic/cubieboard/tD0AxHx5Ync

I've never seen sequential write speeds exceeding 50 MB/s (all measurements reporting more were based on wrong methods), but read speeds are able to climb above 200 MB/s. SATA performance scales roughly linearly with both CPU and DRAM clock, and two kernel config settings give an extra 3%-5% performance gain: CONFIG_SCHED_MC=y and CONFIG_SCHED_SMT=y

Have you ever tested CONFIG_SCHED_MC and CONFIG_SCHED_SMT individually/separately? It doesn't seem logical that CONFIG_SCHED_SMT would make any difference for an A20 processor, which doesn't do hyperthreading. -- silentcreek (talk) 11:25, 02 September 2015 (CEST)
I don't get the relationship with hyperthreading: -- I compared 9 months ago what differed between "Bananian" and "Armbian" since on Igor's images the second CPU core jumped in less frequently and network throughput was worse. After adjusting only these two parameters performance was on a par with Bananian's. -- Tkaiser (talk) 11:58, 3 September 2015 (CEST)
The SMT scheduler is designed for CPUs that do hyperthreading, or, to be more precise, for cores that are not fully independent. So, e.g. if you have 2 physical cores and 2 hyperthreading cores and you start a process with 2 threads, then it's best to let them run on your two physical cores and not on one physical and one hyperthreading core. That's what the SMT scheduler is aware of and optimized for. But the two cores on the A20 are fully independent, so CONFIG_SCHED_MC should suffice. I know the kconfig description text isn't really helpful here, but this article may explain the principles better: Btw, the patch you linked also shows that the relationship between cores is important for the SMT scheduler, namely "which ones have performance interdependency." -- silentcreek (talk) 00:56, 04 September 2015 (CEST)
Just to avoid misunderstandings... While I do give hyperthreading as an example, the SMT scheduler may be useful for other CPU designs as well, especially when cores are paired and share resources so that they do not perform fully independently, like AMD's module-based design (e.g. Bulldozer). And of course the SMT scheduler works on classical multicore designs too, but might come with additional overhead compared to SCHED_MC. An individual comparison might give more insights here, even though I'd assume the difference to be negligible. But I never tested this, that's why I asked. Silentcreek (talk) 12:53, 4 September 2015 (CEST)
I didn't test both settings individually since I had already spent way too much time figuring out why Bananian performed better than Igor's setup back then. Will keep that in mind for the next round of tests and report back. But this will take some time -- Tkaiser (talk) 12:58, 4 September 2015 (CEST)
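For the individual comparison discussed above, a first step would be to confirm which of the two options a given kernel was actually built with. The following is only a minimal Python sketch: it assumes the kernel config is available as plain text (for example read from /boot/config-<version>, or from /proc/config.gz where the kernel exposes it), and the helper name `sched_options` is hypothetical.

```python
import re

# The two scheduler options discussed in this thread.
OPTIONS = ("CONFIG_SCHED_MC", "CONFIG_SCHED_SMT")

def sched_options(config_text):
    """Return {option: bool} for the two scheduler options.

    A line such as 'CONFIG_SCHED_MC=y' counts as enabled. A missing
    line, or '# CONFIG_SCHED_MC is not set', counts as disabled.
    """
    state = {}
    for opt in OPTIONS:
        pattern = r"^%s=y\s*$" % re.escape(opt)
        state[opt] = re.search(pattern, config_text, re.MULTILINE) is not None
    return state

# Made-up sample config, for illustration only.
sample = """\
CONFIG_SMP=y
CONFIG_SCHED_MC=y
# CONFIG_SCHED_SMT is not set
"""
print(sched_options(sample))
```

Recording this next to each benchmark result would make it easier to tell the four possible combinations (both off, MC only, SMT only, both on) apart later.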

I will verify these correlations within the next few weeks with 4.0 or 4.1 but would propose that it should read in the meantime: "SATA throughput is unbalanced for unknown reasons: With appropriate cpufreq settings it's possible to get sequential read speeds of +200 MB/s while write speeds remain at approx. 45 MB/s. This might be caused by buggy silicon or driver problems" Tkaiser (talk) 22:59, 21 June 2015 (CEST)

Done. One thing that comes to my mind: How does this compare to other ARM-based SATA implementations? (Investigate if this is a more general limitation, or a problem specific to the Allwinner platform.) -- NiteHawk (talk) 13:43, 22 June 2015 (CEST)
On i.MX6 one gets roughly 90 MB/s write and +100 MB/s read (without extensive tuning), Marvell's Kirkwood/Armada are known to provide similar or better results. It must be something special to Allwinner A10/A20, maybe a hardware quirk or a few lines of code in Allwinner's ahci driver that have never been touched? I'm too inexperienced at this level to dig deeper. Just have some knowledge/experience testing storage since I'm doing this partially for a living. -- Tkaiser (talk) 14:54, 22 June 2015 (CEST)
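To make write figures like the ones quoted here comparable, it helps to time the write including the flush to disk. A common way for a benchmark to overstate write throughput is to stop the clock while data still sits in the page cache; whether that explains the higher numbers mentioned earlier on this page is not stated, so treat that as an assumption. A minimal Python sketch, where the function name and the default sizes are arbitrary illustrative choices:

```python
import os
import time

def sequential_write_mb_per_s(path, total_mb=64, block_kb=1024):
    """Write total_mb of data sequentially to path and return MB/s.

    The timer stops only after fsync() returns, so the result includes
    the time needed to push the data out of the page cache.
    """
    # Random data, generated once, so all-zero blocks cannot be
    # compressed away by the filesystem or the drive.
    block = os.urandom(block_kb * 1024)
    blocks = (total_mb * 1024) // block_kb
    start = time.monotonic()
    with open(path, "wb") as f:
        for _ in range(blocks):
            f.write(block)
        f.flush()
        os.fsync(f.fileno())
    elapsed = max(time.monotonic() - start, 1e-9)  # avoid division by zero
    return (blocks * block_kb / 1024) / elapsed
```

For a meaningful figure the test file should live on the SATA disk under test, and total_mb should be large compared to any cache in the path; the 64 MB default is only there to keep the sketch quick to run.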