This work is based on the Edge.org 2017 annual question, “What scientific term or concept ought to be more widely known?”, for which a range of eminent scientists and science writers each contributed a short piece nominating a concept. Text analytics was used to visualise the concepts, in the hope of finding related ideas and interdisciplinary links between the concepts the respondents nominated.
Jean Miélot, science in the middle ages (Wikicommons)
The 205 contributed scientific concepts were downloaded and organised by topic and descriptive text. Each was also allocated to a discipline, based on the contributing expert's bio on edge.org. This gave the breakdown shown in Table 1, below.
Respondents to the Edge question seemed to give two types of answer: those related to the scientific method itself, and those derived from particular scientific fields. This suggests some ambiguity in the question. Respondents also sometimes cited concepts from fields other than their own.
| No. | Discipline | Ideas |
|---|---|---|
| 1 | Psychology | 36 |
| 2 | Physics | 31 |
| 3 | Computer Science | 22 |
| 4 | Science General | 21 |
| 5 | Biology | 14 |
| 6 | Philosophy | 12 |
| 7 | Neuroscience | 10 |
| 8 | Cognitive Science | 9 |
| 9 | Mathematics | 5 |
| 10 | Medicine | 5 |
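As a quick arithmetic check on Table 1, the short Python sketch below sums the ten listed rows and expresses each as a share of the 205 concepts. The counts are copied from the table and the total from the text; the `share` helper is only an illustrative name. The rows add up to fewer than 205, so the remaining concepts presumably belong to disciplines the table does not list.

```python
# Counts copied from Table 1; the total of 205 comes from the text.
TOTAL_CONCEPTS = 205
table_counts = {
    "Psychology": 36, "Physics": 31, "Computer Science": 22,
    "Science General": 21, "Biology": 14, "Philosophy": 12,
    "Neuroscience": 10, "Cognitive Science": 9,
    "Mathematics": 5, "Medicine": 5,
}

listed = sum(table_counts.values())   # concepts covered by the table rows
unlisted = TOTAL_CONCEPTS - listed    # concepts in disciplines not shown


def share(count, total=TOTAL_CONCEPTS):
    """Percentage of all concepts, rounded to one decimal place."""
    return round(100 * count / total, 1)


for discipline, count in table_counts.items():
    print(f"{discipline:<18} {count:>3}  {share(count):>5}%")
print(f"Listed in table: {listed}; not listed: {unlisted}")
```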
Using tidytext text-mining methods in R (Silge and Robinson, 2016), we can generate a similarity diagram based on the TF-IDF weights of the key terms in the Edge concepts. The diagram below is filtered at a correlation threshold of 0.075. The discipline colouring is not very informative (the disciplines might be better merged into a smaller number of groups). With some effort, though, we can see areas where ideas cluster within a discipline, such as:
- Maladaptation, somatic evolution and simplistic disease progression in medicine;
- The principle of least action, the big bang and parallel universes in physics;
...but also some cross-disciplinary linkages, including:
- Criticality from physics and positive feedbacks in climate change (environmental science);
- Possibility space in maths and alternative possibilities (scientific method / psychology).
Of course, without further refinement, this technique cannot reliably tell entries that merely share language apart from entries whose underlying concepts are truly related, and the examples above may include both.
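The TF-IDF and correlation-threshold steps described above can be sketched in a few lines of plain Python. This is not the tidytext pipeline used for the diagram: it is a minimal illustration that assumes TF-IDF is computed as term frequency times ln(N / document frequency), and that entries are linked when the Pearson correlation of their TF-IDF vectors exceeds 0.075. The entry names are taken from the examples above, but the keyword strings attached to them are invented for illustration.

```python
import math
from collections import Counter
from itertools import combinations

# Entry names come from the post; the keyword strings are hypothetical
# stand-ins for the essay texts (assumption, not real data).
entries = {
    "criticality": "critical threshold tipping feedback runaway",
    "positive feedbacks": "warming feedback tipping threshold runaway",
    "possibility space": "possible outcomes space model",
    "alternative possibilities": "possible outcomes space alternative",
}


def tf_idf(docs):
    """Return {entry: {term: weight}} with tf = count/length, idf = ln(N/df)."""
    tokens = {name: text.lower().split() for name, text in docs.items()}
    n_docs = len(docs)
    doc_freq = Counter(t for words in tokens.values() for t in set(words))
    weights = {}
    for name, words in tokens.items():
        counts = Counter(words)
        weights[name] = {
            term: (n / len(words)) * math.log(n_docs / doc_freq[term])
            for term, n in counts.items()
        }
    return weights


def pearson(xs, ys):
    """Pearson correlation of two equal-length lists; 0.0 if either is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / math.sqrt(vx * vy)


def similarity_edges(docs, threshold=0.075):
    """Keep entry pairs whose TF-IDF vectors correlate above the threshold."""
    weights = tf_idf(docs)
    vocab = sorted({t for w in weights.values() for t in w})
    vectors = {name: [w.get(t, 0.0) for t in vocab] for name, w in weights.items()}
    edges = []
    for a, b in combinations(vectors, 2):
        r = pearson(vectors[a], vectors[b])
        if r > threshold:
            edges.append((a, b, round(r, 3)))
    return edges


edges = similarity_edges(entries)
for a, b, r in edges:
    print(f"{a} -- {b} (r = {r})")
```

On this toy input only the two pairs that share most of their keywords survive the filter; pairs with no shared terms get a negative correlation and are dropped. The sketch also makes the caveat concrete: the links reflect shared vocabulary, not shared meaning.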
[Zoomed View](https://paulusm-blog-pix.s3-eu-west-1.amazonaws.com/edge-network.png)
Lexical similarity is only one way of linking the entries. Similar ideas from different disciplines that share little vocabulary might also be identified using synonyms and hypernyms (Barzilay & McKeown, 2001). This will be the next stage of the analysis. As Hofstadter observed, ideas can be like musical chords:
“superficially similar ideas are often not deeply related, and deeply related ideas are often superficially disparate ... ideas that share a conceptual skeleton resonate in a sort of conceptual analogue to harmony” (Hofstadter, 1989)
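To make the synonym and hypernym idea concrete, here is a small Python sketch. It is an illustration of the general approach rather than the planned analysis: the `SYNONYMS` and `HYPERNYMS` dictionaries are hand-written and hypothetical (a lexical database such as WordNet could supply such relations in practice), the two keyword lists are invented, and Jaccard overlap is used as a simple stand-in similarity measure.

```python
# Hypothetical hand-built lexicon (assumption): maps a term to a canonical
# synonym or to a broader concept (hypernym). Not drawn from a real resource.
SYNONYMS = {
    "runaway": "self-reinforcing",
    "snowballing": "self-reinforcing",
}
HYPERNYMS = {
    "avalanche": "cascade",
    "meltdown": "cascade",
    "tipping": "threshold",
    "brink": "threshold",
}


def normalise(tokens):
    """Map each token to its canonical synonym, then to its hypernym if any."""
    canonical = [SYNONYMS.get(t, t) for t in tokens]
    return [HYPERNYMS.get(t, t) for t in canonical]


def jaccard(a, b):
    """Share of distinct terms that two token collections have in common."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


# Invented keyword lists for two entries with no words in common.
entry_a = "sandpile avalanche tipping runaway".split()
entry_b = "climate meltdown brink snowballing".split()

raw_overlap = jaccard(entry_a, entry_b)
mapped_overlap = jaccard(normalise(entry_a), normalise(entry_b))

print(f"Overlap on raw words:       {raw_overlap}")
print(f"Overlap on mapped concepts: {mapped_overlap}")
```

The two lists share no surface vocabulary, yet after mapping they overlap on three of five concepts. That is the kind of "conceptual skeleton" a purely lexical comparison misses.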
References
Barzilay, R. & McKeown, K. R. (2001). Extracting paraphrases from a parallel corpus. https://doi.org/10.3115/1073012.1073020
Hofstadter, D. R. (1989). Gödel, Escher, Bach: an eternal golden braid. New York, Vintage Books.
Silge, J. & Robinson, D. (2016). tidytext: Text Mining and Analysis Using Tidy Data Principles in R. JOSS 1 (3). The Open Journal. https://doi.org/10.21105/joss.00037