In September we released v0.1 of the SenseNets app to a focused group of around a dozen alpha testers. Our intrepid testers were mainly academic researchers who were already active on science social media and interested in open science and novel publication methods.
SenseNets v0.1 release: AI nanopublishing assistant
This release was centered on the experience of converting social media posts to nanopublications: users could connect their Twitter accounts, and we processed their posts with AI to (1) flag potential nanopublications and (2) add metadata to make the posts easier to find. Users could then review and edit the drafts, and finally nanopublish the posts they wanted to. We also enabled automated nanopublishing for users who preferred to skip manual review.
You can see a short demo of this version here:
What makes a social media post nanopublication-worthy?
One of the key questions that comes up around nanopublishing is: “what makes a science social media post a potential nanopublication?” Since nanopublishing is such a new practice, there aren't yet many established norms. As the Nanopublications webpage puts it, “a nanopublication can be about anything, for example a relation between a gene and a disease or an opinion.” We adopted a similarly expansive view, treating any post that is even remotely research-related as a valid potential nanopublication. This fits the spirit of the Semantic Web, of which nanopublications are a part: one of its core pillars is the “AAA principle” - Anyone can say Anything about Any topic.
In practice, our automated classification marks a post as a potential nanopublication if either of two indicators is present:
Reference-type: checks whether the post mentions any academic references (e.g., references with a DOI or other scholarly identifier, such as preprints or journal papers).
LLM-based: classifies the post based on its content. This indicator is less precise than the reference-type check, but it detects a wider variety of research-related content. For example, it can flag a post announcing an informal blog post that discusses a new machine learning method.
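As a rough illustration of how these two checks could combine, here is a minimal sketch. It is not our production pipeline (described in the appendix): the DOI regex is a crude proxy for reference detection, and callLLM is a hypothetical stand-in for whatever LLM client one might use.

```ts
// Indicator 1: reference-type. A DOI pattern is a crude proxy for
// "mentions a scholarly reference"; the real pipeline resolves URLs
// to citation metadata (see the appendix).
const DOI_PATTERN = /\b10\.\d{4,9}\/\S+/i;

function hasScholarlyReference(text: string): boolean {
  return DOI_PATTERN.test(text);
}

// Hypothetical LLM client wrapper; substitute your provider of choice.
async function callLLM(prompt: string): Promise<string> {
  throw new Error("wire up an LLM client here");
}

// Indicator 2: LLM-based content classification.
async function isResearchRelated(text: string): Promise<boolean> {
  const prompt =
    'Answer "yes" or "no": is the following social media post related ' +
    `to scientific research?\n\n${text}`;
  const answer = await callLLM(prompt);
  return answer.trim().toLowerCase().startsWith("yes");
}

// A post is a potential nanopublication if either indicator fires.
async function isPotentialNanopub(text: string): Promise<boolean> {
  return hasScholarlyReference(text) || (await isResearchRelated(text));
}
```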
In the next section, we'll discuss our findings and how our experiment played out in practice. For more technical details about our nanopublishing pipeline, check out the appendix section at the end.
What did we learn?
Researchers are curious about nanopublishing
While most of our alpha testers hadn't heard of nanopublishing, we observed a lot of curiosity about the idea. Researchers feel that a lot of valuable research-related knowledge goes unrecorded or unrecognized, and that this knowledge often takes the form of short claims, observations or assertions. We encountered similar curiosity about nanopublishing beyond our focused alpha tester group as well. As just one example, a chance meeting with an astrophysicist at a meetup in Boston led to an invitation to present SenseNets and nanopublications at the Harvard Center for Astrophysics - they know social media is relevant for them as scientists, but are also curious about how it could be improved through new practices.
Researchers still aren't sure what to nanopublish
That said, the novelty of nanopublications is also a source of confusion. We found that nanopublishing means different things to different people. Our working definition has been any research-related social media post, as described above. But some of our testers took a stricter view and saw nanopublications as more like micropublications - an intentional kind of microblogging around research. It was not obvious to testers that even a simple statement, such as noting that one has read a particular paper, could be a nanopublication. A recurring theme was that nanopublishing involves more friction than merely posting about research on social media. One reason for this impression is simply the name (publishing vs. posting); another is that nanopublications behave more like publications in that they can only be retracted, not completely deleted (as a post can be).
Better incentives are needed for nanopublishing
The novelty of nanopublications also means there aren't yet enough incentives for nanopublishing. We hoped that support for Open Science practices and "data altruism" would prove motivating enough for these early stages, but our testers made it clear that nanopublishing has to be more useful for them to invest time in it. For example, they want to see how their nanopublications are being received, or to use them in new kinds of discovery experiences. One discovery service that many researchers were particularly excited about is a "science only" feed.
Nanopublications inspire researchers to think of new ways of sharing information
One of the reasons we continue to be excited by nanopublishing is its capacity to inspire researchers to think of new ways to share their knowledge and connect with their peers. Multiple researchers reported that they started seeing their social media posts "with fresh eyes", as nanopublications. One researcher told us they now see how nanopublications could turn science social media into the leading edge of academic work in a field.
Some researchers had ideas for using nanopublications beyond science social media, for example to record citations in conference presentations.
Another surprising, related finding was that some researchers were excited by the idea of using nanopublications instead of science social media. These researchers did not like participating in social media, but felt they had valuable knowledge to share with their research communities. They saw nanopublications as a new way to convey it.
Next steps
To summarize, this first experiment focused on making nanopublishing easy through the use of an AI nanopublishing assistant. Though limited in scope, the v0.1 release already taught us a lot about the current limitations and promise of nanopublishing. Drawing on these lessons, our next experiment will focus on further reducing the friction around nanopublishing, while also making nanopublications more useful and engaging. We came away with some exciting ideas for how to do that, and aim to release an updated version of the app in the coming weeks - stay tuned!
Special thanks to our designer Andrea Farias for user research and the demo video, and thanks again to our alpha testers for providing invaluable insights and feedback!
Appendix: unpacking the nanopublishing pipeline
The diagram below describes our nanopublishing pipeline:
The platform watcher service automatically fetches new posts as the user creates them.
Posts that are part of a longer thread are merged, and mentioned URLs (references) are extracted and normalized (see the first sketch after this list).
We use the Citoid API to extract reference metadata, including the type of content (blog post, journal article, etc.) and citation information like DOIs (sketched below).
Posts are then processed by our NLP module, which is composed of multiple Langchain modules:
A research detection module flags posts as potential nanopublications using a combination of reference metadata and AI, as described above.
The keywords module extracts keywords related to the content and references.
The relation tagger labels the semantic relations between posts and the references they mention using our Common SenseMaking Ontology (CoSMO). CoSMO draws on a number of related ontologies, including the Citation Typing Ontology (CiTO) and Schema.org. CoSMO also includes new concepts specific to science social media, such as asksQuestionAbout or summarizes.
To nanopublish, we used the nanopub-js library. A simplified sketch of the four-graph structure of a nanopublication appears at the end of this appendix.
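To make the pipeline steps above more concrete, here are a few illustrative sketches. They are simplified stand-ins, not our production code. First, thread merging and URL normalization; the specific normalization rules shown here (stripping fragments and common tracking parameters) are an assumption for illustration.

```ts
// Merge the posts of a thread into a single text for downstream NLP.
function mergeThread(posts: string[]): string {
  return posts.join("\n\n");
}

// Extract the URLs a post mentions.
function extractUrls(text: string): string[] {
  return text.match(/https?:\/\/\S+/g) ?? [];
}

// Normalize a URL so the same reference always maps to the same key:
// drop fragments and common tracking parameters.
function normalizeUrl(raw: string): string {
  const url = new URL(raw);
  url.hash = "";
  for (const param of ["utm_source", "utm_medium", "utm_campaign"]) {
    url.searchParams.delete(param);
  }
  return url.toString();
}
```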
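Next, fetching reference metadata. The sketch below queries the Wikimedia-hosted Citoid REST endpoint for Zotero-style metadata; the choice of that particular deployment, and the small subset of response fields modeled here, are assumptions for illustration rather than a description of our internal setup.

```ts
// Zotero-style item as returned by Citoid (only the fields used here
// are modeled).
interface CitoidItem {
  itemType?: string; // e.g. "journalArticle", "blogPost"
  title?: string;
  DOI?: string;
}

// Look up citation metadata for a normalized URL via a Citoid endpoint.
async function fetchReferenceMetadata(url: string): Promise<CitoidItem[]> {
  const endpoint =
    "https://en.wikipedia.org/api/rest_v1/data/citation/zotero/" +
    encodeURIComponent(url);
  const res = await fetch(endpoint);
  if (!res.ok) {
    throw new Error(`Citoid request failed with status ${res.status}`);
  }
  return (await res.json()) as CitoidItem[];
}
```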
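Finally, the nanopublication itself. A nanopublication is a small RDF dataset with four named graphs: a head that links the other three together, an assertion, provenance, and publication info. The sketch below hand-assembles a simplified TriG serialization for a single post, just to show the shape; the cosmo: namespace URI, the example URIs, and the exact triples are illustrative placeholders, and the actual construction and publishing goes through nanopub-js as noted above.

```ts
// Hand-assembled TriG for a single-post nanopublication. The four-graph
// layout (head, assertion, provenance, pubinfo) follows the nanopub
// model; all URIs below are illustrative placeholders.
function buildNanopubTrig(
  postUri: string,
  doi: string,
  authorUri: string
): string {
  const now = new Date().toISOString();
  return `
@prefix np: <http://www.nanopub.org/nschema#> .
@prefix prov: <http://www.w3.org/ns/prov#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
@prefix cosmo: <https://example.org/cosmo#> .
@prefix : <https://example.org/np/1#> .

:head {
  :pub a np:Nanopublication ;
    np:hasAssertion :assertion ;
    np:hasProvenance :provenance ;
    np:hasPublicationInfo :pubinfo .
}

:assertion {
  # The CoSMO relation chosen by the relation tagger, e.g. summarizes.
  <${postUri}> cosmo:summarizes <https://doi.org/${doi}> .
}

:provenance {
  :assertion prov:wasAttributedTo <${authorUri}> .
}

:pubinfo {
  :pub prov:generatedAtTime "${now}"^^xsd:dateTime .
}
`.trim();
}
```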