Measuring, Reviewing and Reflecting

This week was our final session for DITA. As part of this session we looked at various ways in which we could measure and report our impact. In this post I will take a slightly more reflective approach and discuss a few of the outcomes found in the data and what I will personally try and do with them.

What did I find?

Rather than discussing the process of getting the data I thought I would skip straight to some of the outputs. This should allow some space to discuss what conclusions can be drawn from these outputs.

Twitter reports

Using a spreadsheet from the Tate, I exported my Twitter data for October and November (there isn’t much data for December). This produced the following overview:

It became clear that I haven’t kept up with my tweeting as I intended to at the start of this module. It has been fairly sporadic, and my assumption that it was mainly used to link to my blog posts is confirmed by a word cloud produced by Voyant:

tweetingvisualisation

The above word cloud confirms that most of my tweeting was related to promoting my blog.

What conclusions can be drawn from this?

I had intended to start using Twitter more throughout the course but my use has been haphazard at best. I feel my blogging has been slightly more successful (in terms of output at least, quality can always be improved!). I think this is largely down to the tone of my blog being more familiar to me than that of tweeting. On the other hand I have actively followed a lot of other people on Twitter. It has been especially useful for finding resources and information shared by other people and under LIS and Open Access hashtags.

I’m aiming to make a resolution of sorts to become more active on Twitter, especially in sharing resources, events etc. I have really enjoyed the blogging part of the DITA course. It has been a good way to reflect on the content of labs and lectures and forces me to think about the issues being dealt with. I’m hoping to keep blogging in some form once DITA finishes, although I will probably explore other options for blog hosting. One idea I am toying with is a blog which reflects on the process of writing my dissertation. I think this could be a good way of keeping myself accountable, getting feedback and sharing resources and ideas I have.

As a final note I want to thank Ernesto for organising such an interesting and engaging course, and my fellow #Citylis students for their insights and ideas along the way. It has been a pleasure!

Utrecht University Digital Humanities Lab Text Mining Research Projects

Circulation of Knowledge and Learned Practices in the 17th-century Dutch Republic

Continued from part one

This post will look at one of the Utrecht University Digital Humanities projects: ‘Circulation of Knowledge and Learned Practices in the 17th-century Dutch Republic’. Through outlining some of the aims and methodologies of the project I will try and assess some of the potential strengths and weaknesses of such a project.

What is the project about?

The project seeks to address the question of how discoveries of the 17th century scientific revolution were communicated. At this point academic journals had not become the dominant form of communicating knowledge, and letters between different scientists were a primary means of communicating new knowledge. The project therefore sets out to understand how these letters were used through building a web application:

In order to answer this question the CKCC project built a web application called ePistolarium. With this application researchers can browse and analyze around 20,000 letters that were written by and sent to 17th century scholars who lived in the Dutch Republic. Moreover, the ePistolarium enables visualizations of geographical, time-based, social network and co-citation inquiries.
Utrecht Digital Humanities project descriptions

This is a question which has some obvious interest for librarians. In its focus the project asks questions about the movement of knowledge across networks and seeks to understand this through the use of different visualisations.

Aim of the project

The project outlines a number of different aims beyond addressing its questions about the use of letters during the 17th century:

‘One of the main targets of this project is to create free, online access to historical sources, open to researchers from various disciplines all over the world. Unlike earlier printed editions of correspondences, this source is not static but of a highly dynamic nature, making it possible for the scholarly community to discuss problems, share information and add transcriptions or footnotes. One of the great advantages of this project is its ability to invoke new questions, new interpretations and new information, and to bring all this together on an expanding website.’
Project Description

The project doesn’t intend to address a single historical question, or set of questions, as would often be the case with academic historical research. One of the potential advantages of projects like this is that the process of text mining and application development can also open up the possibility of raising, and potentially addressing, new research questions. This is especially true if the data mined in a project, and the tools used, are made openly available.

One of the potential difficulties for other scholars wanting to use material from this project is that, as far as I could see, it doesn’t make the data openly available, nor does it provide source code for ‘ePistolarium’ (the tool developed for the project). The project outlines some of the technical difficulties they overcame in using the letters. In particular the structure of the letters needed to be edited before they could be imported into a database. This work would likely have to be repeated if the letters were to be used in a different type of project. Even if some additional work had to be done to this data in subsequent projects, once the letters are in a computer readable format this work could be done much more quickly than starting from scratch with the original letters.

Outcome of the project

I spent a little time exploring the ePistolarium app using a test search for ‘Wetenschap’ (science). The visualisations themselves were interesting to view and the app seems to present a quick way to visualise some of the ways in which these letters were communicated. I’ve included some screen shots below:

  1. Searched for ‘Wetenschap’ (science in Dutch)
    wetenschap
  2. Explore different visualisations:
    • Map view showing links between ‘nodes’. Makes use of Google Maps API
    • Timeline: results over time.
      Screen Shot 2014-11-24 at 12.30.26
    • Correspondent Network: who is talking to whom
      Screen Shot 2014-11-24 at 12.28.42
    • Co-citation map: who is mentioned the most in the letters
  3. Viewing text results.
    Screen Shot 2014-11-24 at 12.31.58
  4. Exporting text results as CSV
    Screen Shot 2014-11-24 at 12.41.04

One of the nice features of the app was that results could be exported as a CSV file which could then be analysed by other software. As far as I could see there are not yet any publications available that outline some of the results of the project rather than outlining the methodology. I would be very keen to read papers that make use of this research to see what it might allow historians to do with it. I think there can be a danger with these sorts of projects that the tools become an output in themselves, and although these can be interesting to explore there could be limits to the sort of outputs it can generate. These don’t necessarily have to be academic papers but something which helps ‘draw’ conclusions from the tools rather than leaving individual users to explore the tool for themselves only.
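To give a rough idea of what ‘analysed by other software’ could look like, here is a minimal sketch that counts letters per sender in an exported CSV. The column names (`sender`, `recipient`, `date`), the `countBy` helper and the sample rows are all invented for illustration; I have not checked them against the actual ePistolarium export.

```javascript
// Sketch: count rows per value of one column in a CSV export.
// Column names and sample rows are hypothetical, not the real export format.
const csvText = [
  "sender,recipient,date",
  "Scholar A,Scholar B,1637-03-01",
  "Scholar B,Scholar A,1637-04-12",
  "Scholar A,Scholar C,1640-09-30",
].join("\n");

function countBy(csv, column) {
  const [headerLine, ...lines] = csv.trim().split("\n");
  const headers = headerLine.split(",");
  const index = headers.indexOf(column);
  if (index === -1) throw new Error(`Unknown column: ${column}`);
  const counts = {};
  for (const line of lines) {
    // Naive split: assumes no quoted fields that contain commas.
    const value = line.split(",")[index];
    counts[value] = (counts[value] || 0) + 1;
  }
  return counts;
}

const lettersBySender = countBy(csvText, "sender");
console.log(lettersBySender);
```

A real export with quoted fields would need a proper CSV parser rather than the naive split used here.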

Text mining the Old Bailey online

The Old Bailey Online

Part two here

I have encountered the Old Bailey online a number of times before so know my way around the site quite well. This week our focus was on exploring how the site’s API could allow us to mine the data from the site to use with other tools, in particular Voyant. I will briefly cover how this worked with some screen shots in this post and will spend more time in a longer second post discussing the ‘Circulation of Knowledge and Learned Practices in the 17th-century Dutch Republic’ project at Utrecht University.

I was curious to indulge my interest in ‘witchcraft’, so initially searched for this as a keyword in the main search section:

witchessearch

Following this I turned to the Old Bailey API Demonstrator. This makes it possible to export results to be used by other tools. I used this to export the results of a search for ‘witches’ as a keyword to Voyant, which I discussed in last week’s post. We can now use the tools available on Voyant to further interrogate the data of the Old Bailey in new ways. This is one of the fantastic things about providing an API. It wasn’t necessary for the Old Bailey online to provide all of the tools of Voyant themselves. They could instead provide the underlying data in a way that allows other websites to make use of it. This can avoid the duplication of effort but also opens up the possibility of your data being used in new ways which hadn’t initially been envisaged.

Screen Shot 2014-11-24 at 12.03.16
The output of the Old Bailey viewed in Voyant.

The next post will look in more detail at one of the Utrecht Digital Humanities projects. The process of exporting the Old Bailey results to Voyant was very straightforward. The Old Bailey online provides a link to Voyant, so it is clear that some thought has gone into combining these two tools. This is not always the case. Sometimes data will be provided in ways which make it less easy to use. I will discuss this in more depth in part two, but there are some excellent discussions of ways of dealing with this on the Programming Historian, which is an all-round great resource that I’m aiming to discuss in a separate blog about ‘geeky’ stuff!

Text Analysis Tools

Close reading

I have to start with a confession. I am probably one of those people who might have some hesitancy about a move away from ‘close readings’. As someone who has spent over two years reading a single book (I need some better hobbies), I think there is a lot to be said for a slow, careful reading of a text. In the case of non-fiction, I believe that most writers want to present an argument or an idea in their writing, and that as readers we can try to understand that argument and in doing so evaluate whether the arguments made support the ideas/thesis of the text. Within fiction there are different ‘levels’ at which we can engage with a text. We could read for pleasure, to try and understand how the text fits into other texts of the genre/period, or to assess particular linguistic techniques. What all of these have in common is that they require an active engagement from the reader.

With this in mind, I approached the use of ‘textual analysis’ tools with a little bit of wariness. I think I can be persuaded of their usefulness, albeit with a disclaimer about the ways and situations in which they are used. I also think that some of these tools can lend themselves to misuse in a way that adds no new clarity to a text (although this could be equally true of ‘close readings’).

I thought it made more sense to focus on one feature of one of the platforms we used, to assess how I think it could be useful rather than try and cover all the different features of the different text analysis tools out there. Because I have a particular aversion to ‘word clouds’ 1 (I agree with a lot of what Jacob Harris says on the topic), I thought I would instead look at a different tool and see if this could persuade me of some of the usefulness of text analysis tools.

Voyant’s ‘trends’ feature

I found the features of Voyant more useful than the other tools because of some of the additional options it had, in particular the ‘trends tool’.

As an example, I used a data set of #citylis tweets archived using TAGS (see also here and here). The first step once the text is uploaded to Voyant, which is pretty self-explanatory, was to take a quick look at an overview of the results. This included taking a look at the dreaded ‘word cloud’! The next step was to exclude some of the ‘stop words’. These are words that the text analysis should ignore. This included both general English language words (‘and’, ‘the’ etc.) and ones related to tweeting (‘rt’ etc.). It was also necessary to exclude #citylis itself, as this was the ‘tag’ for the data being analysed but was not actually something we wanted to include in the text analysis. Two of the most common words left after stop words were excluded are ‘#bl’ and ‘Labs’.
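The idea of excluding stop words and then counting what is left can be sketched in a few lines. This is only my own illustration of the general technique, not how Voyant itself works; the stop word list, the `wordFrequencies` function and the sample tweets are all made up.

```javascript
// Minimal sketch: count word frequencies after removing stop words.
// The stop word list and the sample tweets are invented for illustration.
const STOP_WORDS = new Set(["and", "the", "a", "to", "of", "at", "rt", "#citylis"]);

function wordFrequencies(texts, stopWords) {
  const counts = new Map();
  for (const text of texts) {
    // Lowercase, then split on whitespace so hashtags stay in one piece.
    const words = text.toLowerCase().split(/\s+/);
    for (const raw of words) {
      // Trim punctuation from both ends but keep a leading '#'.
      const word = raw.replace(/^[^a-z0-9#]+|[^a-z0-9]+$/g, "");
      if (word === "" || stopWords.has(word)) continue;
      counts.set(word, (counts.get(word) || 0) + 1);
    }
  }
  // Most frequent words first.
  return [...counts.entries()].sort((a, b) => b[1] - a[1]);
}

const sampleTweets = [
  "RT Looking forward to the #bl Labs event #citylis",
  "At the #bl Labs symposium today #citylis",
  "Reading about metadata and the web #citylis",
];

const frequencies = wordFrequencies(sampleTweets, STOP_WORDS);
console.log(frequencies.slice(0, 2));
```

In this invented sample ‘#bl’ and ‘labs’ each appear twice and everything else once, while ‘the’, ‘rt’ and ‘#citylis’ are dropped before counting.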

Screen Shot 2014-11-17 at 12.16.49

One of these, #bl, refers to the British Library; the other, Labs, refers to the British Library Labs (http://labs.bl.uk). With the tools of Voyant it was possible to visualise the trend in use of these two terms.

Screen Shot 2014-11-17 at 12.15.46

From this image it is possible to see that these two words frequently occur together. It is therefore possible to infer that these two words might have some relationship. In this example the two words, ‘#bl’ and ‘labs’, refer to an event about the British Library Labs.
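A rough way of thinking about a ‘trend’ is to split the corpus into segments and count how often a term appears in each one. The sketch below does this for two terms so that their counts can be compared side by side. It is my own simplification and not Voyant’s implementation; the `termTrend` function, the number of segments and the sample tweets are assumptions.

```javascript
// Minimal sketch: count the texts containing a term in each segment of a corpus.
// Not Voyant's code; the segmenting approach and sample data are assumptions.
function termTrend(texts, term, segmentCount) {
  const size = Math.ceil(texts.length / segmentCount);
  const trend = [];
  for (let start = 0; start < texts.length; start += size) {
    const segment = texts.slice(start, start + size);
    // Count the texts in this segment that contain the term as a whole word.
    const hits = segment.filter((text) =>
      text.toLowerCase().split(/\s+/).includes(term)
    ).length;
    trend.push(hits);
  }
  return trend;
}

// Invented tweets, in chronological order.
const tweets = [
  "notes from the lecture",
  "reading for next week",
  "#bl labs event starts soon",
  "at the #bl labs event",
  "great talk at #bl labs",
  "blog post is up",
];

const blTrend = termTrend(tweets, "#bl", 3);
const labsTrend = termTrend(tweets, "labs", 3);
console.log(blTrend, labsTrend);
```

If the two arrays rise and fall together, as they do in this made-up sample, that is a hint (but no more than a hint) that the terms are related.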

In this case I did have a head start: I already knew that these two words were related, so could guess that this would show up in the data. This isn’t ‘cheating’ as such. It is likely that Digital Humanities scholars making use of tools like Voyant will come with an idea, question or assumption about a piece of text(s) that they want to ‘test’. I believe it is in this way that text mining could be useful. It may also serve as a first means of approaching a text, getting an ‘overview’ and coming up with some questions and ideas. In this way ‘text mining’ could serve as a useful new tool. However, in order for the tool to be useful it still requires careful thinking on the part of the user.

A useful tool, if used carefully?

If the ‘reader’ remains central in posing questions, thinking about the text and how to use textual analysis tools, then I believe these tools could be very useful. There is a danger that familiar ways of using textual analysis tools emerge and that the only change made is the text input into the tool. What is more exciting is the possibilities that these tools could offer to people asking interesting questions about texts first, and then thinking of ways in which existing, or new, tools could help them address these questions. It is also very possible that these tools help to come up with new questions. For this to work the reader of texts will have to remain central. Perhaps the difference between a ‘close reading’ and textual analysis tools is not so great?


  1. I think it is fine to use ‘word clouds’ as ‘decoration’. They can also give some superficial insight into some text (or perhaps even deep insights in certain cases). However, I think often the frequent appearance of a ‘word’ is presented as meaningful in itself. I would suggest that it is a human engaging with a broader section of text that provides ‘meaning’ and that there is a danger of using ‘word clouds’ as ‘filler’ in place of a thorough engagement with a text. 

Altmetrics

Altmetrics and Impact

This week the focus of our DITA lecture was on the use of altmetrics. I found this a particularly interesting lecture in relation to current changes in research funding and culture and academia more broadly. Some of these changes include: Open Access, Open Data, an increasing emphasis by funders on ‘impact’ and ‘public engagement’, and the REF1. In this post I will focus on the possible role altmetrics could have in measuring impact.

Impact

The question of what impact research can, or should, have is not a new question. Should the aim of research be to persuade a small group of peers in a field of advances in understanding of that field? Is it also important that research has some ‘application’? Should research be accessible to the public? Is this access formal (no paywall exclusion) or should this access be conceived more broadly (research is presented in ways in which lay audiences can understand its meaning, research can be reutilised…)?

These questions are ones which people doing research will no doubt continue to think about. In this post I want to focus more concretely on the understanding of impact in relation to some of the more ‘mundane’ exercises and practicalities that academics working in the UK have to engage in. I hope to relate this to altmetrics and discuss the extent to which this is currently useful for this concern, and how this might change in the future.

Of particular interest to me is the role altmetrics could potentially serve in measuring impact. However, the usefulness of altmetrics depends largely on what type of impact is being measured.

The REF impact pilot exercise gave a range of recommendations on what could, and perhaps more importantly should, count as impact. To go into them all, let alone discuss the benefits and potential problems of the way in which impact is approached in the REF, would be beyond the scope of this blog. However, the first recommendation given by the pilot exercise gives a good indication of the changing way in which impact is conceived ‘officially’, i.e. by some of the major funders of research in the UK:

It is essential that impact should be defined broadly to include social, economic, cultural, environmental, health and quality of life benefits. Impact purely within academia should not be included in this part of the REF.2

Major funders like the Wellcome have reflected extensively on what kind of impact the research they fund should have and have played a large role in promoting open access policies.3 The reflection on impact takes place within larger shifts in academia. It is in this context that altmetrics are to be considered.

Altmetrics

It is probably worth giving a brief definition of what altmetrics are at this point:

In scholarly and scientific publishing, altmetrics are non-traditional metrics[1] proposed as an alternative[2] to more traditional citation impact metrics, such as impact factor and h-index.4

In the above Wikipedia definition altmetrics are defined as an alternative to traditional citation-based metrics such as impact factor. So, what is it that altmetrics attempt to measure?

Altmetric

Looking at one of the commercial services which provides altmetrics data might give an insight into some of the concerns of altmetrics. During our lab session we explored some of the features of Altmetric to explore the insights it could provide.

Altmetric provides article level metrics. What these metrics track is the engagement with an article on ‘social media sites, newspapers, government policy documents and other sources’.5 This tracking attempts to ascertain how many people are engaging with an article.

As a result of this tracking Altmetric provides a ‘score’. This score is based on ‘volume’, ‘sources’ and ‘authors’.6 One of the defining features of Altmetric is the ‘donut’, which shows the weighting of different components of the overall score. This can give a good idea of whether an article received a lot of attention on Twitter or was blogged about. I hope to dedicate another post to digging more deeply into the features of Altmetric but want to spend a little more time with it before I do that.

How useful will altmetrics be in measuring impact in the future?

It is clear there are many benefits of altmetrics for an individual wanting to assess the impact their article is having in the sources Altmetric and other companies track. This in itself could be very useful. The question that might inform continued take up of altmetrics is the extent to which they are absorbed into broader measures of impact being used in any future assessments of research quality and the interest funding bodies place in these metrics.

There was some consideration of the use of altmetrics in the lead up to the most recent REF. A review by HEFCE outlines the role of metrics in the assessment of research.7 The review outlines some of the potential problems of using these metrics, including bias towards particular disciplines (a sociology paper making fun of hipsters is probably more likely to be tweeted about than a paper in a sub-discipline of mathematics). This in itself is not necessarily a problem.
It is also likely that any future uptake of altmetrics would not make direct comparisons across disciplines. The danger could still remain that within a discipline, if altmetrics played a major role in assessing research quality and funding then researchers might begin to direct their research towards the goal of creating only a particular type of impact: that picked up in whatever version of altmetrics is being used.

This however is not a criticism that can only be applied to altmetrics. Academics already have to concern themselves with various impact measures and will often have to take consideration of things like the REF when they plan their research and output.

It is very unlikely that altmetrics will be used as the primary measure of either ‘impact’ or ‘excellence’ soon. What is interesting about altmetrics is the different insights they can give into the way in which research is engaged with. What I am particularly interested in is some of the research we could potentially do with the use of altmetrics. In particular they may be able to give some insights into the relation between (a certain form of) impact and whether an article is open access or not. Altmetrics are already being used to assess the impact of open access. In the future could they play a role in persuading researchers that, regardless of the type of research they do, the ‘impact’ will be greater in an open access journal?8 I am thinking about exploring this in my DITA essay so may blog about it again as I’m working on it.


  1. The Research Excellence Framework replaces the Research Assessment Exercise and is intended to ‘assess the quality of research in UK higher education institutions’. In their own words: ‘The Research Excellence Framework (REF) is the new system for assessing the quality of research in UK higher education institutions (HEIs). It will replace the Research Assessment Exercise (RAE) and will be completed in 2014.’ http://www.ref.ac.uk
  2. The report can be accessed here 
  3. http://www.wellcome.ac.uk/About-us/Publications/Reports/Public-engagement/WTP052365.htm 
  4. http://en.wikipedia.org/wiki/Altmetrics 
  5. Altmetric – What does Altmetric do? 
  6. Altmetric – What does Altmetric do?
  7. http://www.hefce.ac.uk/whatwedo/rsrch/howfundr/metrics/ 
  8. A common misconception about Open Access is that it is only about making research open to ‘the public’. However, other researchers, even those with institutional backing, are often also excluded by paywalls. I am surprised at how often I hit paywalls for which City University doesn’t have a subscription. This is not a criticism of City University Library but it is a reflection on how common this problem can be.

Part 2: The nuts and bolts (I think)

This is part two of three blogs on the use of twitter for research. Part one is here.

How Tags works (probably)

What I hope to do in this post is identify some of the moving parts that make Tags work. I will not try and delve into the actual workings of the code used as that is beyond my (hopefully current) abilities. What I will try and do instead is identify the different components which interact with each other to make Tags work. At the end of the post I will highlight some of the resources which I intend to use in gaining a better understanding of the inner workings of Tags. With a bit of luck I will be in a position to attempt to build something simple myself using what I’ve learnt.

The Nuts and Bolts

An API

The Twitter API is what allows access to the data of twitter. Having access to an API for twitter is required before we can think of developing methods of collecting Tweets. Once an API is set up and authorised we can start using other software to make use of the data in Twitter.

Google Apps Scripts

Google Apps Scripts is used to get data from Twitter, through the API, into a Google spreadsheet. This script is what is going on behind the scenes when you ‘Run Tags’. It is these scripts which ‘talk’ to Twitter: they carry a search (for example my search for #openaccess) from my Google spreadsheet to Twitter and come back again with the appropriate tweets in tow.
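To make the ‘and back again’ step a bit more concrete, here is a minimal sketch of turning a list of tweets into rows that a spreadsheet could hold. This is my guess at the general shape, not the Tags code: the `tweetsToRows` function, the choice of columns and the sample response are all invented, and the field names assume tweet objects that look roughly like the ones the Twitter API returns.

```javascript
// Sketch only: turn tweet objects into spreadsheet-style rows.
// The column choice is hypothetical and not taken from the Tags source.
const COLUMNS = ["id_str", "from_user", "text", "created_at"];

function tweetsToRows(tweets) {
  return tweets.map((tweet) => [
    tweet.id_str,
    tweet.user.screen_name,
    tweet.text,
    tweet.created_at,
  ]);
}

// Invented sample data in the rough shape of a search response.
const sampleResponse = {
  statuses: [
    {
      id_str: "1",
      user: { screen_name: "example_user" },
      text: "Reading about #openaccess",
      created_at: "Tue Oct 28 14:08:07 +0000 2014",
    },
    {
      id_str: "2",
      user: { screen_name: "another_user" },
      text: "New #openaccess policy announced",
      created_at: "Tue Oct 28 14:10:00 +0000 2014",
    },
  ],
};

// Header row first, then one row per tweet.
const rows = [COLUMNS, ...tweetsToRows(sampleResponse.statuses)];
console.log(rows);
```

Inside an actual Apps Script, rows like these could then be written to a sheet with something along the lines of `sheet.getRange(1, 1, rows.length, COLUMNS.length).setValues(rows)`, although I haven’t checked whether Tags does it this way.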

Some computer languages

Google Apps Script makes use of Javascript – a common language used for web development. It is cloud based and many scripts can already be found which can be adapted for different purposes.

HTML – is used to make the TAGS website itself. HTML is a fairly accessible language and since its primary purpose is formatting text for websites, the logic we need to approach it with isn’t so unfamiliar to us.

Alongside this are a whole host of other languages which are running in the background. The purpose of this post isn’t to be exhaustive but to try and give myself a better understanding of some aspects of coding I may want to pursue.

Scary gobbledegook!

Before highlighting some of the resources available for learning more about this coding business I decided to have a look at a little bit of code myself and see if I could make any sense of it. This is a bit of code which updates tags so it works with the new version of Twitter’s API. You can have a look at the code on Github.

There is a lot I don’t understand about the inner workings and syntax of the code but hopefully I can point out some of the general principles. This is largely an exercise in trying to convince myself (and with a bit of luck you!) that this coding business is not beyond comprehension.

I have cut some snippets of the code and will summarise what I think each bit is roughly about. It is quite possible I am completely off the mark (feel free to tell me if this is the case).

function getTweets(searchTerm, maxResults, sinceid, languageCode) {
    //Based on Mikael Thuneberg getTweets - mod by mhawksey to convert to json
    // if you include setRowsData this can be used to output chosen entries

This is a description of what the purpose of the code is.

  var data = [];
  var idx = 0;
  var ss = SpreadsheetApp.getActiveSpreadsheet();
  var sumSheet = ss.getSheetByName("Readme/Settings");
  if (isConfigured()){
   var oauthConfig = UrlFetchApp.addOAuthService("twitter");
    oauthConfig.setAccessTokenUrl("https://api.twitter.com/oauth/access_token");
    oauthConfig.setRequestTokenUrl("https://api.twitter.com/oauth/request_token");
    oauthConfig.setAuthorizationUrl("https://api.twitter.com/oauth/authorize");
    oauthConfig.setConsumerKey(getConsumerKey());
    oauthConfig.setConsumerSecret(getConsumerSecret());
    var requestData = {
          "oAuthServiceName": "twitter",
          "oAuthUseToken": "always"
        };
  } else {
    Browser.msgBox("Twitter API Configuration Required")
  }

Code is often written with a similar basic logic (it does get more complicated of course). This logic often goes along the lines of: if some condition is met, do one thing; if not, do this other thing. This section of code is, I think, trying to establish a link between the spreadsheet and Twitter’s API. The ‘if’ branch runs when the API has been configured and sets up what is needed to get the OAuth access token.[1] The ‘else’ tells the programme what to do if the API hasn’t been configured: it will display a message saying ‘Twitter API Configuration Required’.
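The same if/else shape can be shown in a stripped-down form that runs outside a spreadsheet. This is only a sketch with stand-in names: the `connect` function and the `settings` object are hypothetical and are not part of Tags.

```javascript
// Sketch of the if/else pattern in the snippet above, using stand-in names.
// `settings` and `connect` are hypothetical, not part of Tags.
function connect(settings) {
  // Treat the API as configured only if both keys are present.
  const isConfigured = Boolean(settings.consumerKey && settings.consumerSecret);
  if (isConfigured) {
    // The real script sets up the OAuth service at this point.
    return "ready to request tweets";
  } else {
    // Mirrors the message box shown when the keys are missing.
    return "Twitter API Configuration Required";
  }
}

console.log(connect({ consumerKey: "abc", consumerSecret: "xyz" }));
console.log(connect({}));
```

The first call takes the ‘if’ branch and the second takes the ‘else’ branch, which is all the original snippet is really deciding between.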

The rest of the code starts to make less intuitive sense to me. I will try to understand what this means at some point!

Resources

Here is a small list of resources which I intend to pursue. They are in a somewhat rough order based on the criteria of what I think will be both most accessible and immediately useful to me (and maybe more generally for non computer expert librarians).

HTML and CSS

HTML is probably one of the most accessible languages to get a basic grip on. I already have some experience with HTML and LaTeX[2] which is fairly similar. I don’t have any experience with CSS but it is included with many of the HTML tutorials and since it is a big part of styling websites it makes sense to try to learn this alongside HTML.

  1. Codecademy
  2. W3schools
  3. http://learn.shayhowe.com/html-css/

There are plenty of other sites available and many free resources so it is probably best to just try them out and see what you like.

Google Apps Scripts/Javascript

Since Tags works using Google Apps scripts I am intrigued to explore how this works. Google provides some introductions and tutorials on their site. Google scripts works on the basis of Javascript and it is another language heavily used on the web so I think it makes sense to explore this a little bit.

  1. Google Apps Scripts
  2. Codecademy Javascript

Ruby

Another language commonly used on the web. I have often heard it described as a very ‘elegant’ coding language. Might as well learn a bit of an ‘elegant’ language alongside a more ‘ugly’ language like Javascript! There also seem to be lots of good resources for this so it is definitely on my to do list.

http://tryruby.org/
This is a fun website that gives you a chance to try some basic coding using Ruby.

  1. Google Apps Scripts
  2. Try Ruby
  3. Introduction to Ruby comic!

There is quite a lot to get on with here. Whether I find time to really ‘learn’ any of these languages fully during the rest of the DITA module is doubtful but I do hope to make a start and see how I get on. If anyone has any tips feel free to drop a comment below!


[1] This was discussed briefly in the DITA lecture. It is essentially an open standard for ensuring secure authorisation of different applications. Their site is here: http://oauth.net/

[2] LaTeX is a ‘document preparation system’. It can be used instead of a ‘what you see is what you get’ (WYSIWYG) programme like Word or LibreOffice to prepare text documents. Instead of formatting everything by hand as you do with Word or LibreOffice you indicate the structure with some syntax and LaTeX ‘typesets’ the document for you. It tends to produce much more attractive documents than WYSIWYG programmes and is pretty easy to use once you have got the hang of the logic behind it.

Twitter research: some possibilities

Tags and Data Visualisation

I’ve decided to split this post into three parts. In the first post I want to discuss some of the things we did during this week’s lab exercise. In the second part I will discuss the way in which (I think) Tags works and discuss some resources that I intend to use so I can answer the question of ‘how it works’ with a little more authority. In the third part I will very quickly give my take on why we might want to archive and research Twitter, and what potential ethical questions this raises.

Archiving Tweets

One of the issues highlighted in this week’s reading and lecture was the difficulty of archiving and analysing Tweets. One of the main problems we are confronted with is how to deal with the ‘speed’ at which Twitter moves. Alongside this we also have the additional problem of the massive volume of data we may end up dealing with. If we attempted to manually collect these Tweets we’d become overwhelmed pretty quickly. At the moment we are still in the somewhat early stages of doing research using Twitter but there are some methods currently available that go some way to overcoming these difficulties.

At this point the question of why we’d even want to archive and analyse Twitter might come up. I will wait to address this question in the third part of the post.

Tags

This week we followed on from last week’s lab session and made use of the API we had begun to set up in that session. The aim of the session was to set up a method of archiving tweets in a Google Drive spreadsheet, begin to analyse some of the data within those tweets, and finally present some of this data in the form of visualisations. In order to do this we made use of Tags.
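To make the archiving step a bit more concrete, here is a minimal sketch in Python of the general idea: keep a table of tweets and only add the ones that are not already in it. This is not how Tags itself is implemented. The column names (`id`, `from_user`, `text`), the function names and the sample tweets are all assumptions I have made up for illustration.

```python
import csv
import io

# Assumed column names for the archive; a real archive would have many more.
FIELDS = ["id", "from_user", "text"]


def archive_tweets(existing_rows, new_tweets):
    """Return the archive with any not-yet-seen tweets appended (keyed on id)."""
    seen = {row["id"] for row in existing_rows}
    merged = list(existing_rows)
    for tweet in new_tweets:
        if tweet["id"] not in seen:
            merged.append({field: tweet[field] for field in FIELDS})
            seen.add(tweet["id"])
    return merged


def to_csv(rows):
    """Serialise the archive as CSV text, ready to open in a spreadsheet."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Hypothetical data: one tweet already archived, two fetched later.
archive = [{"id": "1", "from_user": "alice", "text": "Hello #openaccess"}]
fetched = [
    {"id": "1", "from_user": "alice", "text": "Hello #openaccess"},
    {"id": "2", "from_user": "bob", "text": "Why #openaccess matters"},
]
archive = archive_tweets(archive, fetched)
print(to_csv(archive))
```

Keying on the tweet id means the same search can be re-run as often as needed without filling the archive with duplicates, which matters given how quickly Twitter moves.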

My results

I won’t cover all the steps to setting up Tags as I think the instructions on the site are fairly straightforward. I wanted to instead present some of the outputs that I came up with.

I used Tags to archive and visualise tweets using the hashtag #openAccess[1]. A screenshot of the twitter feed for #openaccess currently (28.10.14, 14:08:07) looks like this:

Screen Shot 2014-10-28 at 14.08.47

This is perfectly acceptable when browsing hashtags over a coffee, but it isn’t particularly clear how you would approach this to carry out research on the way people use Twitter to discuss open access. Even the archive of tweets produced using Tags already provides a better means of viewing some of the activity on Twitter for #openaccess.

Screen Shot 2014-10-28 at 14.23.23

However, Tags also provides other reporting on #openaccess such as the top tweeters using the hashtag:
toptweetOA
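
A ‘top tweeters’ report like this boils down to counting how many archived tweets each account has sent. Here is a rough sketch of that counting in Python. The rows and the `from_user` column name are assumptions for illustration, not real data from my archive, and `top_tweeters` is a name I have made up.

```python
from collections import Counter

# Hypothetical rows standing in for an archive of tweets; the "from_user"
# and "text" column names are assumptions made for this sketch.
archive = [
    {"from_user": "alice", "text": "New preprint out #openaccess"},
    {"from_user": "bob", "text": "Why #openaccess matters"},
    {"from_user": "alice", "text": "Slides from today's talk #openaccess"},
    {"from_user": "carol", "text": "Reading list #openaccess"},
    {"from_user": "alice", "text": "Thanks all #openaccess"},
    {"from_user": "bob", "text": "Event next week #openaccess"},
]


def top_tweeters(rows, n=3):
    """Return the n most frequent tweeters as (user, tweet_count) pairs."""
    counts = Counter(row["from_user"] for row in rows)
    return counts.most_common(n)


for user, count in top_tweeters(archive):
    print(user, count)
```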

Tags also allows the option of generating some potentially helpful visualisations. These could be particularly helpful if trying to get a ‘bird’s-eye view’ of some of the activity that is taking place within Twitter.

What next?

Trying out Tags is probably the best way of seeing what features it offers. There are some limitations to what it is possible to do with Tags due to the way it works, and I will discuss these and the resources I intend to follow up in the next post.

[1] The reason I chose Open Access was that it is likely to be the focus of my dissertation topic. Open Access by Peter Suber offers a good introduction to Open Access. You can find an open access PDF here or an HTML version here.

APIs part 2.

Continued from Part 1

Using some APIs

  1. The first API I tried let me embed my Twitter feed on the homepage of my blog.
    • This seemed to take a couple of tries to display correctly on my blog. I am not sure if this was down to the theme I was using before or something I was doing wrong. There still seems to be some issue with it displaying in my Chrome browser but I suspect this is because of the numerous plugins I have installed.
  2. The second API I used was for Google Maps. For this one I just followed the instructions WordPress provides for embedding Google Maps. I just wanted to try it out so searched for ‘libraries’ in London on Google Maps. I then embedded the results as a page on my blog. I’m not sure whether it will stay there or not. This is the map that Google produced:

What next?

I enjoyed my initial attempts to use APIs on my blog. I can see how even using some basic APIs can be a powerful tool. I have been reading through the WordPress documentation for ‘shortcodes’ and will try out some other ones in future posts. I will also try and delve a little deeper into some of the more complex possibilities and see how I get on.

For this post I also enabled Markdown editing on WordPress. Markdown is intended to make writing and reading a marked up document easier. So far it seems to be a helpful way of writing more fluently without having to type a lot of HTML or constantly click on formatting options in the WordPress editor. I will try and do a separate post in the future explaining how to use Markdown for WordPress and why I think it makes sense to get used to writing for the web using Markdown.

APIs part 1.

This week in DITA the topic was APIs. I had encountered the term before and had a pretty good idea of what they were.1 I hadn’t spent much time previously trying to use APIs, or at least not deliberately. It turns out that many of the pages I use on a regular basis make use of APIs. Before getting to that though it might be worth giving a brief overview of what an API is.

What is an API?

“In computer programming, an application programming interface (API) specifies a software component in terms of its operations, their inputs and outputs and underlying types. Its main purpose is to define a set of functionalities that are independent of their respective implementation, allowing both definition and implementation to vary without compromising each other.
In addition to accessing databases or computer hardware, such as hard disk drives or video cards, an API can be used to ease the work of programming graphical user interface components, to allow integration of new features into existing applications (a so-called “plug-in API”), or to share data between otherwise distinct applications. In practice, many times an API comes in the form of a library that includes specifications for routines, data structures, object classes, and variables. In some other cases, notably for SOAP and REST services, an API comes as just a specification of remote calls exposed to the API consumers.” Wikipedia

When I read that explanation before the DITA lecture and lab it didn’t help much to clarify what an API was. However, in the context of the lecture and the lab exercises it starts to make a lot more sense, in particular the purpose of ‘defin[ing] a set of functionalities that are independent of their respective implementation’. An API allows some functionalities, i.e. features, to be used without needing to know how they are implemented behind the scenes. This means different applications can use the basic features of a website like Twitter in new ways.
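
To check my own understanding I tried sketching this idea in Python. Everything here is hypothetical: `TimelineAPI`, `FakeService` and the two ‘client’ functions are names I invented, and nothing is taken from Twitter’s real API. The point is only that both clients rely on the same small interface and never look at how the service behind it stores or fetches its posts.

```python
from typing import Protocol


class TimelineAPI(Protocol):
    """Hypothetical interface: says what can be asked for, not how it is done."""

    def recent_posts(self, user: str) -> list[str]: ...


class FakeService:
    """One possible implementation behind the API (an in-memory stand-in)."""

    def __init__(self, posts: dict[str, list[str]]) -> None:
        self._posts = posts

    def recent_posts(self, user: str) -> list[str]:
        return list(self._posts.get(user, []))


def plain_client(api: TimelineAPI, user: str) -> str:
    """A first 'app': shows the posts one per line."""
    return "\n".join(api.recent_posts(user))


def hashtag_client(api: TimelineAPI, user: str) -> list[str]:
    """A second 'app': shows only the hashtags a user has posted."""
    tags = []
    for post in api.recent_posts(user):
        tags.extend(word for word in post.split() if word.startswith("#"))
    return tags


service = FakeService({"alice": ["Hello #openaccess", "Reading #LIS papers"]})
print(plain_client(service, "alice"))
print(hashtag_client(service, "alice"))
```

If `FakeService` were swapped for something that fetched posts over the web, neither client would need to change, as long as `recent_posts` kept behaving the same way.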

What are the benefits of APIs?

Rather than talk about the programming logic that makes APIs work well, I thought I would try and come up with some of the obvious ways in which an API is useful from the user’s perspective.

  • I can use a service like Twitter through different applications that may have an interface which for me is more intuitive than the web version of Twitter, or has additional features which make it more efficient to use for particular activities.

  • By providing APIs, projects which have limited resources, or time, can allow other developers to open up access to a service on a different application. An example of this is Scholarley, an unofficial Android client for Mendeley.

  • A service like Mendeley can be used in many different ways, by many different people. Mendeley provides extensive documentation for its API here. In this way there is some overlap between what APIs and open source software allow, although there are major differences between the two, something I will try and explore in a later post.

I will continue exploring APIs in part 2 of this post.


  1. My understanding was something along these lines: an API is a way in which a website like Twitter can be used by an application like Tweetdeck or Fenix, allowing you to interact with the Twitter service through an app. I don’t think this understanding was too far off but the lecture and lab definitely clarified how this works.