【統計】(マナー集)統計で転ばぬ先の杖([Statistics] A stick to keep you from falling with statistics)

帯に謳う「それでいいの? もやもやする統計分析があふれていると思いませんか」にドキリとするマナー集。(?の後ろの空白に、著者が日本語教育学者であることを感じる。のは、私が校閲経験者だからか)

【書籍情報】「統計で転ばぬ先の杖」著者:島田めぐみ 、野口裕之/発行:ひつじ書房/発売日:2021/3/31

「転ばぬ先の杖」英語表現がうまくできず直訳になってしまった。ことわざって難しい。


本の特徴

  • 学術論文に載せてある統計分析の基本的誤りを指摘
  • 統計初心者が陥りがちな表現を実例とともに解説
  • ウェブマガジンの連載に加筆して書籍化


感想

  1. 論文内「グラフの大量生産」に警鐘
  2. 「統計を学ぶ」ではなく「記述のマナー」に特化
  3. なんとなく統計手法使ってるドヤ勢に氷剣刺すクールな先生


1.私が唯一書いた論文は、学生時代の卒論です。それも先輩の書いたやつの数値を入れ替えただけのダメ学生でした。卒研部屋の先生には頭が上がりません。ひゃくちゃん…すみませんでした。よく卒業できたな。

この本にも学生たちの論文例が出てきます。耳が痛いご指摘が多く…自分を顧みて庇うとすれば、20代の文系学生に統計知識があるだろうか、表現まで考慮できるだろうか。となると、世に出す前に論文をチェックする先生たちにぜひこの本を読んでいただきたい。


2.そもそも図いる?ラベルつけてる?連番は?スケールは?アメリカ心理学会(APA)の公式マニュアルを参考に出してくれているので説得力あります。書いてあることはグラフ可視化における基礎なんですが、できてない実例を示してくれるので衝撃でした。個人的には「罫線は少なく。適切なスペースが代わりになる」が好き。もう縦罫いらないんです。。でも私はきっと表側に罫線つけちゃう…。


3.統計ソフトの出力のまま出さない、小数点以下むやみに多く書かない、有意差がないのも結果です。t検定は何度もしません。χ2検定はクロス表ですよ。有意傾向ってp値基準ふたつも提示してややこしいでしょ。

ごもっともです。はい。気を付けます。と読みながら何度も謝りたい気分になりました。論文書く機会はないけど、BIツールで統計分析を示すときにまた読み直そう。


うーん、しかし顧客にとって統計が身近じゃなかったら、意味すら伝わらないから…有意って、一意とは、から説明するのは現実的じゃない。注釈に入れてもかさばるし。そもそもアナリストの報告書って、ともすると「これ最新手法、この視点おもろくない?」と言いたい自己満で長くなりがちなので、短く簡潔にまとめようね、賢い人は多くを語らないんだぞ。と自戒している。

ただBIツールは可視化が簡単にできるからこそ、細部はマナーを守りたい。わかる人には信頼してもらえる…はず。


*******


The blurb on the book's wraparound band asks, "Is that really OK? Don't you think the world is full of statistical analyses that leave you feeling uneasy?" It is a collection of etiquette that gives you a jolt. (The space after the "?" tells me that the author is a scholar of Japanese-language education. Or perhaps I only notice because I have worked as a proofreader.)

[Book Information] "A stick to keep you from falling with statistics" Authors: Megumi Shimada, Hiroyuki Noguchi / Publisher: Hitsuji Shobo / Publication date: 31 March 2021

I couldn't find a good English equivalent for 「転ばぬ先の杖」, so I ended up with the literal translation "a stick to keep you from falling". Proverbs are difficult.


Features of the book

  • Points out basic errors in statistical analyses published in academic papers
  • Explains, with real examples, the wording that beginners in statistics tend to fall into
  • Based on a web magazine series, expanded and published as a book


Thoughts

  1. A warning against the "mass production of graphs" in papers
  2. Focused on "the etiquette of writing up results" rather than on "learning statistics"
  3. Cool-headed teachers who drive an ice sword into the smug crowd that uses statistical methods without quite knowing why


1. The only paper I have ever written is my graduation thesis from my student days. Worse, I was a poor student who merely swapped the numbers in a thesis written by a senior student. I can never hold my head up in front of the professor of my graduation research lab. Hyaku-chan... I'm sorry. How did I ever graduate?

This book also includes examples from students' papers. Many of the criticisms are painful to hear. If I look back and try to defend myself: do liberal arts students in their twenties have statistical knowledge? Can they go as far as thinking about presentation? That being so, I would very much like the teachers who check papers before they go out into the world to read this book.


2. Do you even need a figure in the first place? Is it labelled? Is it numbered in sequence? What about the scale?

The advice is convincing because it draws on the official manual of the American Psychological Association (APA). What it says is basic graph visualisation, but seeing real examples that get it wrong was a shock. Personally, I like "Use few ruled lines; appropriate spacing takes their place." Vertical rules are no longer needed... But I'm sure I'll still put a rule beside the row-heading column...
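
As a small illustration of the "few ruled lines, let spacing do the work" advice, here is a minimal Python sketch that prints a table with three horizontal rules and no vertical ones. The function name `render_table` and the numbers in the example are my own placeholders, not something taken from the book.

```python
def render_table(headers, rows, gap=3):
    """Render a plain-text table using horizontal rules only (no vertical lines)."""
    # Convert every cell to text so column widths can be measured.
    cells = [[str(c) for c in headers]] + [[str(c) for c in row] for row in rows]
    widths = [max(len(r[i]) for r in cells) for i in range(len(headers))]
    sep = " " * gap  # spacing takes the place of vertical rules

    def fmt(row):
        # Left-align the first column (labels), right-align the rest (numbers).
        first = row[0].ljust(widths[0])
        rest = [c.rjust(w) for c, w in zip(row[1:], widths[1:])]
        return sep.join([first] + rest)

    rule = "-" * (sum(widths) + gap * (len(widths) - 1))
    body = [fmt(r) for r in cells[1:]]
    return "\n".join([rule, fmt(cells[0]), rule] + body + [rule])


# Hypothetical numbers, for illustration only.
print(render_table(["Group", "n", "M", "SD"],
                   [["A", 10, "3.21", "0.45"],
                    ["B", 12, "2.87", "0.52"]]))
```

The only lines left are the rule above the header, the rule below it, and the closing rule; the columns are separated by spaces alone.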


3. Don't present statistical software output as it is. Don't write needlessly many decimal places. A non-significant difference is also a result. Don't run the t-test over and over. The χ² test is for cross tabulations. "Marginally significant" means presenting two p-value thresholds, which is confusing, isn't it?

Quite right. Yes. I will be careful. As I read, I felt like apologising over and over again. I have no occasion to write papers, but I'll read it again when I present statistical analysis in a BI tool.
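
To see for myself why "don't run the t-test over and over" matters, here is a rough Python sketch. It assumes each test has a 5% false-positive chance and treats the tests as independent, which is a simplification. The helper names are made up for this note, and the p-value formatting follows the APA convention as I understand it (no leading zero, three decimals, "< .001" for very small values).

```python
def familywise_error_rate(n_tests, alpha=0.05):
    """Chance of at least one false positive across n_tests independent tests."""
    return 1 - (1 - alpha) ** n_tests


def format_p(p):
    """Format a p value compactly instead of pasting the software's raw output."""
    if p < 0.001:
        return "p < .001"
    # Three decimals, with the leading zero dropped (0.032 -> .032).
    return "p = " + f"{p:.3f}".lstrip("0")


# Comparing three groups pairwise takes three t-tests.
print(f"{familywise_error_rate(3):.3f}")  # 0.143, well above the nominal 0.05
print(format_p(0.0321478))                # p = .032
```

With three pairwise comparisons the chance of a spurious "significant" result is already close to 14% under these assumptions.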


Hmm, but if statistics is not familiar to the client, even the meaning won't get across... Starting from what "significant" means, or what "unique" means, is not realistic. And putting it in the notes only adds bulk.

Besides, analysts' reports tend to run long out of a self-satisfied urge to say, "This is the latest method! Isn't this angle interesting?" So keep it short and concise; wise people don't say much. That is what I tell myself.

Still, precisely because BI tools make visualisation so easy, I want to observe the etiquette in the details. Those who understand should then trust the work... I hope.