It depends on what "academic achievement" is taken to mean. Every psychologist who's studied the macro side of cognition (what it means to know, learn, and think) will tell you that it is simply not possible to do any high-quality testing using paper and pencil. Most good, experienced teachers have a better idea of the state of their class's learning just by daily observation than any standardised paper-and-pencil test can yield.
Rote learning (cut-and-dried facts and rules, the least-important kind of learning in Bloom's and Scrivens's taxonomies) can be tested with paper and pencil. Higher-order learning? Hard to do at the lower levels, and impossible at the top.
Everyone learns at different rates and in different ways. Standardised tests can't accommodate those differences, whereas teachers can because they're not trying to generate fine-grained data.
The smarter someone is, the more ambiguity they find in most standard tests because the test writers simply can't deal with such individuals. They (try to) use phrasing that MOST test-takers will read a certain way. But this derails people who aren't under that center hump in the curve.
That's why our Methodology profs told us that creating tests and surveys is the most difficult possible kind of work, and that almost everyone produces own-goal items, sometimes quite often. Even the people who create the NORC surveys, the world's gold standard, write at least one or two, if not a dozen, bad questions every time. And they routinely draw on all the academic expertise in the US and Canada!
The best testing is a series of subject-matter interviews conducted by subject-matter experts who are also good interviewers. They can get at higher-order learning. But machine-scored paper-and-pencil tests? Not worth the effort except perhaps for admission to university, and even there the more exclusive schools rely on the interview.