Title: | Comprehensive Science Mapping Analysis |
---|---|
Description: | Tool for quantitative research in scientometrics and bibliometrics. It implements the comprehensive workflow for science mapping analysis proposed in Aria M. and Cuccurullo C. (2017) <doi:10.1016/j.joi.2017.08.007>. 'bibliometrix' provides various routines for importing bibliographic data from 'SCOPUS', 'Clarivate Analytics Web of Science' (<https://www.webofknowledge.com/>), 'Digital Science Dimensions' (<https://www.dimensions.ai/>), 'OpenAlex' (<https://openalex.org/>), 'Cochrane Library' (<https://www.cochranelibrary.com/>), 'Lens' (<https://lens.org>), and 'PubMed' (<https://pubmed.ncbi.nlm.nih.gov/>) databases, performing bibliometric analysis and building networks for co-citation, coupling, scientific collaboration and co-word analysis. |
Authors: | Massimo Aria [cre, aut, cph] , Corrado Cuccurullo [aut] |
Maintainer: | Massimo Aria <[email protected]> |
License: | GPL-3 |
Version: | 4.3.1 |
Built: | 2025-01-22 09:33:30 UTC |
Source: | https://github.com/massimoaria/bibliometrix |
INSTALLATION
- Stable version from CRAN:
install.packages("bibliometrix")
- Or development version from GitHub:
install.packages("devtools")
devtools::install_github("massimoaria/bibliometrix")
- Load "bibliometrix"
library('bibliometrix')
DATA LOADING AND CONVERTING
The export file can be imported and converted by R using the function *convert2df*:
file <- ("https://www.bibliometrix.org/datasets/savedrecs.txt")
M <- convert2df(file, dbsource = "wos", format = "bibtex")
*convert2df* creates a bibliographic data frame in which each row is a manuscript and each column is a Field Tag from the original export file. Each manuscript contains several elements, such as authors' names, title, keywords and other information. All these elements constitute the bibliographic attributes of a document, also called metadata. Data frame columns are named using the standard Clarivate Analytics WoS Field Tag codes.
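As a rough illustration of this layout, the sketch below builds a tiny data frame by hand in base R. The records are made up, and the column names AU, TI, PY and TC are assumed here to be the WoS tags for authors, title, publication year and times cited; multi-valued fields are assumed to be joined with ";", as in the sep argument used throughout this manual.

```r
# Toy bibliographic data frame with made-up records.
# Column names follow WoS-style field tags (assumed: AU, TI, PY, TC).
M_toy <- data.frame(
  AU = c("ROSSI A;BIANCHI B", "VERDI C;BIANCHI B;ROSSI A", "NERI D"),
  TI = c("PAPER ONE", "PAPER TWO", "PAPER THREE"),
  PY = c(2017, 2016, 2004),
  TC = c(120, 35, 60),
  stringsAsFactors = FALSE
)

# Multi-valued fields are split on the separator to recover single items.
authors_per_doc <- strsplit(M_toy$AU, ";", fixed = TRUE)
n_authors <- lengths(authors_per_doc)
print(n_authors)
```

Each row is one manuscript, so any per-document measure (here, the number of authors) is a vector with one element per row.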
BIBLIOMETRIC ANALYSIS
The first step is to perform a descriptive analysis of the bibliographic data frame. The function *biblioAnalysis* calculates main bibliometric measures using this syntax:
results <- biblioAnalysis(M, sep = ";")
The function *biblioAnalysis* returns an object of class "bibliometrix".
To summarize the main results of the bibliometric analysis, use the generic function *summary*. It displays main information about the bibliographic data frame and several tables, such as annual scientific production, top manuscripts per number of citations, most productive authors, most productive countries, total citation per country, most relevant sources (journals) and most relevant keywords. *summary* accepts two additional arguments. *k* is a formatting value that indicates the number of rows of each table. *pause* is a logical value (TRUE or FALSE) used to allow (or not) pause in screen scrolling. With k = 10, each table shows its first 10 rows: the top 10 authors, the top 10 sources, etc.
S <- summary(object = results, k = 10, pause = FALSE)
Some basic plots can be drawn using the generic function plot:
plot(x = results, k = 10, pause = FALSE)
BIBLIOGRAPHIC NETWORK MATRICES
A manuscript's attributes are connected to each other through the manuscript itself: author(s) to journal, keywords to publication date, etc. These connections between different attributes generate bipartite networks that can be represented as rectangular matrices (Manuscripts x Attributes). Furthermore, scientific publications regularly contain references to other scientific works. This generates a further network, namely a co-citation or coupling network. These networks are analyzed in order to capture meaningful properties of the underlying research system, and in particular to determine the influence of bibliometric units such as scholars and journals.
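A minimal base R sketch of how a rectangular Manuscripts x Attributes matrix yields one-mode networks. The matrix A and its labels are made up for illustration; the matrix products follow the general approach of Batagelj & Cerinsek (2013) and are not taken from the package source.

```r
# Toy Manuscripts x Cited-references incidence matrix (made-up data).
# Rows are citing manuscripts, columns are cited references.
A <- matrix(
  c(1, 1, 0,
    1, 0, 1,
    1, 1, 1),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("doc1", "doc2", "doc3"), c("refA", "refB", "refC"))
)

# Bibliographic coupling: manuscripts are linked by the references they share.
coupling <- A %*% t(A)

# Co-citation: references are linked by the manuscripts that cite both.
cocitation <- t(A) %*% A

print(coupling)
print(cocitation)
```

The off-diagonal cells count shared items, while the diagonal holds each unit's own total (references per manuscript, or citations per reference).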
*biblioNetwork* function
The function *biblioNetwork* calculates, starting from a bibliographic data frame, the most frequently used networks: Coupling, Co-citation, Co-occurrences, and Collaboration. *biblioNetwork* uses two arguments to define the network to compute:
- *analysis* argument can be "co-citation", "coupling", "collaboration", or "co-occurrences".
- *network* argument can be "authors", "references", "sources", "countries", "universities", "keywords", "author_keywords", "titles" and "abstracts".
For example, the following code calculates a classical co-citation network:
NetMatrix <- biblioNetwork(M, analysis = "co-citation", network = "references", sep = ";")
VISUALIZING BIBLIOGRAPHIC NETWORKS
All bibliographic networks can be graphically visualized or modeled. Using the function *networkPlot*, you can plot a network created by *biblioNetwork* using R routines.
The main argument of *networkPlot* is type. It indicates the network map layout: circle, kamada-kawai, mds, etc.
In the following, we propose some examples.
### Country Scientific Collaboration
# Create a country collaboration network
M <- metaTagExtraction(M, Field = "AU_CO", sep = ";")
NetMatrix <- biblioNetwork(M, analysis = "collaboration", network = "countries", sep = ";")
# Plot the network
net <- networkPlot(NetMatrix, n = dim(NetMatrix)[1], Title = "Country Collaboration", type = "circle", size = TRUE, remove.multiple = FALSE, labelsize = 0.8)
### Co-Citation Network
# Create a co-citation network
NetMatrix <- biblioNetwork(M, analysis = "co-citation", network = "references", sep = ";")
# Plot the network
net <- networkPlot(NetMatrix, n = 30, Title = "Co-Citation Network", type = "fruchterman", size = TRUE, remove.multiple = FALSE, labelsize = 0.7, edgesize = 5)
### Keyword co-occurrences
# Create keyword co-occurrences network
NetMatrix <- biblioNetwork(M, analysis = "co-occurrences", network = "keywords", sep = ";")
# Plot the network
net <- networkPlot(NetMatrix, normalize = "association", weighted = TRUE, n = 30, Title = "Keyword Co-occurrences", type = "fruchterman", size = TRUE, edgesize = 5, labelsize = 0.7)
CO-WORD ANALYSIS: THE CONCEPTUAL STRUCTURE OF A FIELD
The aim of the co-word analysis is to map the conceptual structure of a framework using the word co-occurrences in a bibliographic collection. The analysis can be performed through dimensionality reduction techniques such as Multidimensional Scaling (MDS), Correspondence Analysis (CA) or Multiple Correspondence Analysis (MCA). Here, we show an example using the function *conceptualStructure*, which performs a CA or MCA to draw a conceptual structure of the field and K-means clustering to identify clusters of documents that express common concepts. Results are plotted on a two-dimensional map. *conceptualStructure* includes natural language processing (NLP) routines (see the function *termExtraction*) to extract terms from titles and abstracts. In addition, it implements Porter's stemming algorithm to reduce inflected (or sometimes derived) words to their word stem, base or root form.
# Conceptual Structure using keywords (method="MCA")
CS <- conceptualStructure(M,field="ID", method="MCA", minDegree=4, clust=4 ,k.max=8, stemming=FALSE, labelsize=10, documents=10)
HISTORICAL DIRECT CITATION NETWORK
The historiographic map is a graph proposed by E. Garfield to represent a chronological network map of most relevant direct citations resulting from a bibliographic collection. The function histNetwork generates a chronological direct citation network matrix which can be plotted using *histPlot*:
# Create a historical citation network
histResults <- histNetwork(M, sep = ";")
# Plot the historical direct citation network
net <- histPlot(histResults, size = 10)
Massimo Aria [cre, aut, cph] (<https://orcid.org/0000-0002-8517-9411>), Corrado Cuccurullo [aut] (<https://orcid.org/0000-0002-7401-8575>)
Aria, M. & Cuccurullo, C. (2017). *bibliometrix*: An R-tool for comprehensive science mapping analysis, *Journal of Informetrics*, 11(4), pp 959-975, Elsevier, DOI: 10.1016/j.joi.2017.08.007 (https://doi.org/10.1016/j.joi.2017.08.007).
Cuccurullo, C., Aria, M., & Sarto, F. (2016). Foundations and trends in performance management. A twenty-five years bibliometric analysis in business and public administration domains, *Scientometrics*, DOI: 10.1007/s11192-016-1948-8 (https://doi.org/10.1007/s11192-016-1948-8).
Cuccurullo, C., Aria, M., & Sarto, F. (2015). Twenty years of research on performance management in business and public administration domains. Presentation at the *Correspondence Analysis and Related Methods conference (CARME 2015)* in September 2015 (https://www.bibliometrix.org/documents/2015Carme_cuccurulloetal.pdf).
Sarto, F., Cuccurullo, C., & Aria, M. (2014). Exploring healthcare governance literature: systematic review and paths for future research. *Mecosan* (https://www.francoangeli.it/Riviste/Scheda_Rivista.aspx?IDarticolo=52780&lingua=en).
Cuccurullo, C., Aria, M., & Sarto, F. (2013). Twenty years of research on performance management in business and public administration domains. In *Academy of Management Proceedings* (Vol. 2013, No. 1, p. 14270). Academy of Management (https://doi.org/10.5465/AMBPP.2013.14270abstract).
It calculates and plots the author production (in terms of number of publications) over the time.
authorProdOverTime(M, k = 10, graph = TRUE)
M | is a bibliographic data frame obtained by convert2df. |
k | is an integer. It is the number of top authors to analyze and plot. Default is k = 10. |
graph | is logical. If TRUE the function plots the author production over time graph. Default is graph = TRUE. |
The function authorProdOverTime returns a list containing the following objects:
dfAU | is a data frame |
dfpapersAU | is a data frame |
graph | a ggplot object |
biblioAnalysis function for bibliometric analysis
summary method for class 'bibliometrix'
data(scientometrics, package = "bibliometrixData")
res <- authorProdOverTime(scientometrics, k = 10)
print(res$dfAU)
plot(res$graph)
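To make the idea behind authorProdOverTime concrete, here is a base R sketch that counts publications per author and year on made-up records. It is only an assumption about the kind of tabulation involved, not the package's implementation; the toy columns AU and PY and the value of k are illustrative.

```r
# Toy records (made-up): authors separated by ";" and a publication year.
M_toy <- data.frame(
  AU = c("ROSSI A;BIANCHI B", "ROSSI A", "BIANCHI B;VERDI C", "ROSSI A;VERDI C"),
  PY = c(2015, 2016, 2016, 2017),
  stringsAsFactors = FALSE
)

# Expand to one row per (author, year) pair.
authors <- strsplit(M_toy$AU, ";", fixed = TRUE)
long <- data.frame(
  Author = unlist(authors),
  Year = rep(M_toy$PY, lengths(authors)),
  stringsAsFactors = FALSE
)

# Publications per author per year, then keep the k most productive authors.
k <- 2
prod <- table(long$Author, long$Year)
top <- names(sort(rowSums(prod), decreasing = TRUE))[seq_len(k)]
print(prod[top, , drop = FALSE])
```

The resulting table has one row per selected author and one column per year, which is the shape a production-over-time plot needs.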
It performs a bibliometric analysis of a dataset imported from SCOPUS and Clarivate Analytics Web of Science databases.
biblioAnalysis(M, sep = ";")
M | is a bibliographic data frame obtained by the converting function convert2df. |
sep | is the field separator character. This character separates strings in each column of the data frame. The default is sep = ";". |
biblioAnalysis returns an object of class "bibliometrix".
The functions summary and plot are used to obtain or print a summary and some useful plots of the results.
An object of class "bibliometrix" is a list containing the following components:
Articles | the total number of manuscripts | |
Authors | the authors' frequency distribution | |
AuthorsFrac | the authors' frequency distribution (fractionalized) | |
FirstAuthors | corresponding author of each manuscript | |
nAUperPaper | the number of authors per manuscript | |
Appearances | the number of author appearances | |
nAuthors | the number of authors | |
AuMultiAuthoredArt | the number of authors of multi-authored articles | |
MostCitedPapers | the list of manuscripts sorted by citations | |
Years | publication year of each manuscript | |
FirstAffiliation | the affiliation of the first author | |
Affiliations | the frequency distribution of affiliations (of all co-authors for each paper) | |
Aff_frac | the fractionalized frequency distribution of affiliations (of all co-authors for each paper) | |
CO | the affiliation country of the first author | |
Countries | the affiliation countries' frequency distribution | |
CountryCollaboration | Intra-country (SCP) and intercountry (MCP) collaboration indices | |
TotalCitation | the number of times each manuscript has been cited | |
TCperYear | the yearly average number of times each manuscript has been cited | |
Sources | the frequency distribution of sources (journals, books, etc.) | |
DE | the frequency distribution of authors' keywords | |
ID | the frequency distribution of keywords associated to the manuscript by SCOPUS and Clarivate Analytics Web of Science database |
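The difference between the Authors and AuthorsFrac components can be sketched in base R. The example assumes the usual fractional counting rule, where a paper with n co-authors contributes 1/n to each of them; whether the package uses exactly this rule is an assumption, and the records are made up.

```r
# Made-up author strings, one element per manuscript.
AU <- c("ROSSI A;BIANCHI B", "ROSSI A", "BIANCHI B;VERDI C;ROSSI A")
authors <- strsplit(AU, ";", fixed = TRUE)
n_per_paper <- lengths(authors)

# Full counting: every appearance counts as 1.
full <- table(unlist(authors))

# Fractional counting: each paper contributes 1/n to each of its n co-authors.
weights <- rep(1 / n_per_paper, n_per_paper)
frac <- vapply(split(weights, unlist(authors)), sum, numeric(1))

print(full)
print(round(frac, 3))
```

Under this rule the fractional counts sum to the number of manuscripts, which makes them comparable across collections with different co-authorship habits.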
convert2df to import and convert a WoS or SCOPUS export file into a bibliographic data frame.
summary to obtain a summary of the results.
plot to draw some useful plots of the results.
## Not run:
data(management, package = "bibliometrixData")
results <- biblioAnalysis(management)
summary(results, k = 10, pause = FALSE)
## End(Not run)
biblioNetwork creates different bibliographic networks from a bibliographic data frame.
biblioNetwork(
  M,
  analysis = "coupling",
  network = "authors",
  n = NULL,
  sep = ";",
  short = FALSE,
  shortlabel = TRUE,
  remove.terms = NULL,
  synonyms = NULL
)
M | is a bibliographic data frame obtained by the converting function convert2df. |
analysis | is a character object. It indicates the type of analysis to be performed. |
network | is a character object. It indicates the network typology. The |
n | is an integer. It indicates the number of items to select. If |
sep | is the field separator character. This character separates strings in each column of the data frame. The default is sep = ";". |
short | is a logical. If TRUE all items with frequency<2 are deleted to reduce the matrix size. |
shortlabel | is logical. If TRUE, reference labels are stored in a short format. Default is shortlabel = TRUE. |
remove.terms | is a character vector. It contains a list of additional terms to delete from the documents before term extraction. The default is remove.terms = NULL. |
synonyms | is a character vector. Each element contains a list of synonyms, separated by ";", that will be merged into a single term (the first word contained in the vector element). The default is synonyms = NULL. |
The function biblioNetwork can create a collection of bibliographic networks following the approach proposed by Batagelj & Cerinsek (2013) and Aria & Cuccurullo (2017).
Typical networks output of biblioNetwork are:
#### Collaboration Networks ############
– Authors collaboration (analysis = "collaboration", network = "authors")
– University collaboration (analysis = "collaboration", network = "universities")
– Country collaboration (analysis = "collaboration", network = "countries")
#### Co-citation Networks ##############
– Authors co-citation (analysis = "co-citation", network = "authors")
– Reference co-citation (analysis = "co-citation", network = "references")
– Source co-citation (analysis = "co-citation", network = "sources")
#### Coupling Networks ################
– Manuscript coupling (analysis = "coupling", network = "references")
– Authors coupling (analysis = "coupling", network = "authors")
– Source coupling (analysis = "coupling", network = "sources")
– Country coupling (analysis = "coupling", network = "countries")
#### Co-occurrences Networks ################
– Authors co-occurrences (analysis = "co-occurrences", network = "authors")
– Source co-occurrences (analysis = "co-occurrences", network = "sources")
– Keyword co-occurrences (analysis = "co-occurrences", network = "keywords")
– Author-Keyword co-occurrences (analysis = "co-occurrences", network = "author_keywords")
– Title content co-occurrences (analysis = "co-occurrences", network = "titles")
– Abstract content co-occurrences (analysis = "co-occurrences", network = "abstracts")
References:
Batagelj, V., & Cerinsek, M. (2013). On bibliographic networks. Scientometrics, 96(3), 845-864.
Aria, M., & Cuccurullo, C. (2017). bibliometrix: An R-tool for comprehensive science mapping analysis. Journal of Informetrics, 11(4), 959-975.
It is a square network matrix. It is an object of class dgMatrix of the package Matrix.
convert2df to import and convert a SCOPUS or Thomson Reuters' ISI Web of Knowledge export file into a data frame.
cocMatrix to compute a co-occurrence matrix.
biblioAnalysis to perform a bibliometric analysis.
# EXAMPLE 1: Authors collaboration network
# data(scientometrics, package = "bibliometrixData")
# NetMatrix <- biblioNetwork(scientometrics, analysis = "collaboration",
#   network = "authors", sep = ";")
# net <- networkPlot(NetMatrix, n = 30, type = "kamada", Title = "Collaboration", labelsize = 0.5)

# EXAMPLE 2: Co-citation network
data(scientometrics, package = "bibliometrixData")
NetMatrix <- biblioNetwork(scientometrics, analysis = "co-citation",
  network = "references", sep = ";")
net <- networkPlot(NetMatrix, n = 30, type = "kamada", Title = "Co-Citation", labelsize = 0.5)
biblioshiny performs science mapping analysis using the main functions of the bibliometrix package.
biblioshiny(
  host = "127.0.0.1",
  port = NULL,
  launch.browser = TRUE,
  maxUploadSize = 200
)
host | The IPv4 address that the application should listen on. Defaults to the shiny.host option, if set, or "127.0.0.1" if not. |
port | is the TCP port that the application should listen on. If the port is not specified, and the shiny.port option is set (with options(shiny.port = XX)), then that port will be used. Otherwise, use a random port. |
launch.browser | If true, the system's default web browser will be launched automatically after the app is started. Defaults to true in interactive sessions only. The value of this parameter can also be a function to call with the application's URL. |
maxUploadSize | is an integer. The maximum upload file size, in megabytes. Default value is 200. |
#biblioshiny()
Data frame containing a list of tags and corresponding: WoS, SCOPUS and generic bibtex fields; and Dimensions.ai csv and xlsx fields.
A data frame with 44 rows and 6 variables:
Tag Fields
Scopus bibtex fields
WOS/ISI bibtex fields
Generic bibtex fields
DIMENSIONS csv/xlsx old fields
DIMENSIONS csv/xlsx fields
It estimates and draws the Bradford's law source distribution.
bradford(M)
M | is a bibliographic data frame. |
Bradford's law is a pattern first described by Samuel C. Bradford (1934) that estimates the exponentially diminishing returns of searching for references in science journals.
One formulation is that if journals in a field are sorted by number of articles into three groups, each with about one-third of all articles, then the number of journals in each group will be proportional to 1:n:n².
Reference:
Bradford, S. C. (1934). Sources of information on specific subjects. Engineering, 137, 85-86.
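The three-zone partition described above can be sketched in base R. The article counts are made up, and cutting the cumulative share of articles at one third and two thirds is an assumption about how the zones are delimited, not the code used by bradford.

```r
# Made-up number of articles per source, sorted from most to least productive.
articles <- c(J1 = 32, J2 = 18, J3 = 14, J4 = 10, J5 = 8,
              J6 = 6, J7 = 5, J8 = 4, J9 = 2, J10 = 1)
articles <- sort(articles, decreasing = TRUE)

# Cumulative share of all articles covered when moving down the ranking.
cum_share <- cumsum(articles) / sum(articles)

# Assign each source to the zone in which its cumulative share falls.
zone <- cut(cum_share, breaks = c(0, 1/3, 2/3, 1),
            labels = c("Zone 1", "Zone 2", "Zone 3"))

print(data.frame(Source = names(articles), Freq = articles,
                 cumShare = round(cum_share, 2), Zone = zone))
print(table(zone))
```

A few highly productive sources fill the first zone, while each later zone needs progressively more sources to cover a similar share of articles.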
The function bradford returns a list containing the following objects:
table | a data frame with the source distribution partitioned in the three zones |
graph | the source distribution plot in ggplot2 format |
biblioAnalysis function for bibliometric analysis
summary method for class 'bibliometrix'
## Not run:
data(management, package = "bibliometrixData")
BR <- bradford(management)
## End(Not run)
It calculates frequency distribution of citations.
citations(M, field = "article", sep = ";")
M | is a bibliographic data frame obtained by the converting function convert2df. |
field | is a character. It can be "article" or "author" to obtain the frequency distribution of cited references or cited authors (only first authors for WoS database) respectively. The default is field = "article". |
sep | is the field separator character. This character separates citations in each string of CR column of the bibliographic data frame. The default is sep = ";". |
an object of class "list" containing the following components:
Cited | the most frequent cited manuscripts or authors |
Year | the publication year (only for cited article analysis) |
Source | the journal (only for cited article analysis) |
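A base R sketch of the kind of tabulation behind the Cited component. The cited-reference strings are made up, and their "AUTHOR, YEAR, SOURCE" layout is an assumption chosen for illustration; real CR strings vary by database.

```r
# Made-up CR strings: references separated by ";",
# each assumed to look like "AUTHOR, YEAR, SOURCE".
CR <- c("SMITH J, 2001, J INF;DOE A, 1999, SCI MAP",
        "SMITH J, 2001, J INF",
        "DOE A, 1999, SCI MAP;LEE K, 2010, RES EVAL;SMITH J, 2001, J INF")

# Split on the separator and count how often each reference is cited.
refs <- trimws(unlist(strsplit(CR, ";", fixed = TRUE)))
cited <- sort(table(refs), decreasing = TRUE)

# Cited first authors: keep the text before the first comma.
first_author <- sub(",.*$", "", refs)
cited_authors <- sort(table(first_author), decreasing = TRUE)

print(cited)
print(cited_authors)
```

Switching from references to first authors mirrors the choice between field = "article" and field = "author".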
biblioAnalysis function for bibliometric analysis.
summary to obtain a summary of the results.
plot to draw some useful plots of the results.
## EXAMPLE 1: Cited articles
data(scientometrics, package = "bibliometrixData")
CR <- citations(scientometrics, field = "article", sep = ";")
CR$Cited[1:10]
CR$Year[1:10]
CR$Source[1:10]

## EXAMPLE 2: Cited first authors
data(scientometrics, package = "bibliometrixData")
CR <- citations(scientometrics, field = "author", sep = ";")
CR$Cited[1:10]
cocMatrix computes occurrences between elements of a Tag Field from a bibliographic data frame. Manuscript is the unit of analysis.
cocMatrix(
  M,
  Field = "AU",
  type = "sparse",
  n = NULL,
  sep = ";",
  binary = TRUE,
  short = FALSE,
  remove.terms = NULL,
  synonyms = NULL
)
M | is a data frame obtained by the converting function convert2df. |
Field | is a character object. It indicates one of the field tags of the standard ISI WoS Field Tag codify. Field can be equal to one of these tags: for a complete list of field tags see: Field Tags used in bibliometrix |
type | indicates the output format of co-occurrences: |
n | is an integer. It indicates the number of items to select. If |
sep | is the field separator character. This character separates strings in each column of the data frame. The default is sep = ";". |
binary | is a logical. If TRUE each cell contains a 0/1. If FALSE each cell contains the frequency. |
short | is a logical. If TRUE all items with frequency<2 are deleted to reduce the matrix size. |
remove.terms | is a character vector. It contains a list of additional terms to delete from the documents before term extraction. The default is remove.terms = NULL. |
synonyms | is a character vector. Each element contains a list of synonyms, separated by ";", that will be merged into a single term (the first word contained in the vector element). The default is synonyms = NULL. |
This occurrence matrix represents a bipartite network which can be transformed into a collection of bibliographic networks such as coupling, co-citation, etc.
The function follows the approach proposed by Batagelj & Cerinsek (2013) and Aria & Cuccurullo (2017).
References:
Batagelj, V., & Cerinsek, M. (2013). On bibliographic networks. Scientometrics, 96(3), 845-864.
Aria, M., & Cuccurullo, C. (2017). bibliometrix: An R-tool for comprehensive science mapping analysis. Journal of Informetrics, 11(4), 959-975.
a bipartite network matrix with cases corresponding to manuscripts and variables to the objects extracted from the Tag Field.
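The effect of the binary argument on a Manuscripts x Terms occurrence matrix can be sketched in base R. The keyword strings are made up and the dense matrix is built by hand; cocMatrix itself may build a sparse object differently, so treat this as an illustration of the two cell encodings only.

```r
# Made-up keyword strings, one per manuscript, separated by ";".
DE <- c("BIBLIOMETRICS;CITATION ANALYSIS",
        "BIBLIOMETRICS;SCIENCE MAPPING;BIBLIOMETRICS",
        "SCIENCE MAPPING")

terms <- strsplit(DE, ";", fixed = TRUE)
vocab <- sort(unique(unlist(terms)))

# Frequency matrix: cell (i, j) counts how often term j appears in manuscript i.
freq <- t(vapply(
  terms,
  function(x) as.numeric(table(factor(x, levels = vocab))),
  numeric(length(vocab))
))
dimnames(freq) <- list(paste0("doc", seq_along(DE)), vocab)

# Binary matrix: cell (i, j) is 1 if term j appears at least once in manuscript i.
binary <- (freq > 0) * 1

print(freq)
print(binary)
```

The two matrices differ only where a term is repeated within the same manuscript, as in doc2 here.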
convert2df to import and convert an ISI or SCOPUS export file into a data frame.
biblioAnalysis to perform a bibliometric analysis.
biblioNetwork to compute a bibliographic network.
# EXAMPLE 1: Articles x Authors occurrence matrix
data(scientometrics, package = "bibliometrixData")
WA <- cocMatrix(scientometrics, Field = "AU", type = "sparse", sep = ";")

# EXAMPLE 2: Articles x Cited References occurrence matrix
# data(scientometrics, package = "bibliometrixData")
# WCR <- cocMatrix(scientometrics, Field = "CR", type = "sparse", sep = ";")

# EXAMPLE 3: Articles x Cited First Authors occurrence matrix
# data(scientometrics, package = "bibliometrixData")
# scientometrics <- metaTagExtraction(scientometrics, Field = "CR_AU", sep = ";")
# WCR <- cocMatrix(scientometrics, Field = "CR_AU", type = "sparse", sep = ";")
A function to create and plot country collaboration networks by Region
collabByRegionPlot(
  NetMatrix,
  normalize = NULL,
  n = NULL,
  degree = NULL,
  type = "auto",
  label = TRUE,
  labelsize = 1,
  label.cex = FALSE,
  label.color = FALSE,
  label.n = Inf,
  halo = FALSE,
  cluster = "walktrap",
  community.repulsion = 0,
  vos.path = NULL,
  size = 3,
  size.cex = FALSE,
  curved = FALSE,
  noloops = TRUE,
  remove.multiple = TRUE,
  remove.isolates = FALSE,
  weighted = NULL,
  edgesize = 1,
  edges.min = 0,
  alpha = 0.5,
  verbose = TRUE
)
NetMatrix | is a country collaboration matrix obtained by the function biblioNetwork. |
normalize | is a character. It can be "association", "jaccard", "inclusion", "salton" or "equivalence" to obtain Association Strength, Jaccard, Inclusion, Salton or Equivalence similarity index respectively. The default is normalize = NULL. |
n | is an integer. It indicates the number of vertices to plot. |
degree | is an integer. It indicates the min frequency of a vertex. If degree is not NULL, n is ignored. |
type | is a character object. It indicates the network map layout: |
label | is logical. If TRUE vertex labels are plotted. |
labelsize | is an integer. It indicates the label size in the plot. Default is labelsize = 1. |
label.cex | is logical. If TRUE the label size of each vertex is proportional to its degree. |
label.color | is logical. If TRUE, for each vertex, the label color is the same as its cluster. |
label.n | is an integer. It indicates the number of vertex labels to draw. |
halo | is logical. If TRUE communities are plotted using different colors. Default is halo = FALSE. |
cluster | is a character. It indicates the type of cluster to perform among ("none", "optimal", "louvain", "leiden", "infomap", "edge_betweenness", "walktrap", "spinglass", "leading_eigen", "fast_greedy"). |
community.repulsion | is a real. It indicates the repulsion force among network communities. It is a real number between 0 and 1. Default is community.repulsion = 0. |
vos.path | is a character indicating the full path where VOSviewer.jar is located. |
size | is integer. It defines the size of each vertex. Default is size = 3. |
size.cex | is logical. If TRUE the size of each vertex is proportional to its degree. |
curved | is a logical or a number. If TRUE edges are plotted with an optimal curvature. Default is curved = FALSE. |
noloops | is logical. If TRUE loops in the network are deleted. |
remove.multiple | is logical. If TRUE multiple links are plotted using just one edge. |
remove.isolates | is logical. If TRUE isolated vertices are not plotted. |
weighted | This argument specifies whether to create a weighted graph from an adjacency matrix. If it is NULL then an unweighted graph is created and the elements of the adjacency matrix give the number of edges between the vertices. If it is a character constant then for every non-zero matrix entry an edge is created and the value of the entry is added as an edge attribute named by the weighted argument. If it is TRUE then a weighted graph is created and the name of the edge attribute will be weight. |
edgesize | is an integer. It indicates the network edge size. |
edges.min | is an integer. It indicates the min frequency of edges between two vertices. If edges.min = 0, all edges are plotted. |
alpha | is a number. Legal alpha values are any numbers from 0 (transparent) to 1 (opaque). The default is alpha = 0.5. |
verbose | is a logical. If TRUE, network will be plotted. Default is verbose = TRUE. |
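The similarity indices named by the normalize argument can be illustrated with their textbook definitions. The sketch below applies them in base R to a made-up co-occurrence matrix C whose diagonal holds each item's own occurrence count; it is an assumption that the package computes the indices in exactly this form.

```r
# Made-up symmetric co-occurrence matrix; diag(C) is each item's occurrence count.
C <- matrix(
  c(4, 2, 1,
    2, 3, 0,
    1, 0, 2),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("a", "b", "c"), c("a", "b", "c"))
)
s <- diag(C)

# Textbook similarity indices for a pair (i, j) with co-occurrence c_ij.
association <- C / outer(s, s)               # c_ij / (s_i * s_j)
jaccard     <- C / (outer(s, s, "+") - C)    # c_ij / (s_i + s_j - c_ij)
salton      <- C / sqrt(outer(s, s))         # c_ij / sqrt(s_i * s_j)
inclusion   <- C / outer(s, s, pmin)         # c_ij / min(s_i, s_j)
equivalence <- C^2 / outer(s, s)             # c_ij^2 / (s_i * s_j)

print(round(jaccard, 3))
print(round(salton, 3))
```

All five indices rescale raw co-occurrence counts by item frequency, so that frequent items do not dominate the network simply because they appear often.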
It is a list containing the following elements:
graph | a network object of the class igraph |
cluster_obj | a communities object of the package igraph |
cluster_res | a data frame with main results of clustering procedure. |
## Not run:
data(management, package = "bibliometrixData")
management <- metaTagExtraction(management, Field = "AU_CO")
NetMatrix <- biblioNetwork(management, analysis = "collaboration", network = "countries")
net <- collabByRegionPlot(NetMatrix, edgesize = 4, label.cex = TRUE, labelsize = 2.5,
  weighted = TRUE, size = 0.5, size.cex = TRUE, community.repulsion = 0, verbose = FALSE)
cbind(names(net))
plot(net[[4]]$graph)
## End(Not run)
The function conceptualStructure creates a conceptual structure map of a scientific field performing Correspondence Analysis (CA), Multiple Correspondence Analysis (MCA) or Metric Multidimensional Scaling (MDS) and Clustering of a bipartite network of terms extracted from keyword, title or abstract fields.
conceptualStructure(
  M,
  field = "ID",
  ngrams = 1,
  method = "MCA",
  quali.supp = NULL,
  quanti.supp = NULL,
  minDegree = 2,
  clust = "auto",
  k.max = 5,
  stemming = FALSE,
  labelsize = 10,
  documents = 2,
  graph = TRUE,
  remove.terms = NULL,
  synonyms = NULL
)
M |
is a data frame obtained by the converting function
|
||||||||||||||||||
field |
is a character object. It indicates one of the field tags of the standard ISI WoS Field Tag codify. field can be equal to one of these tags:
|
||||||||||||||||||
ngrams |
is an integer between 1 and 4. It indicates the type of n-gram to extract from texts.
An n-gram is a contiguous sequence of n terms. The function can extract n-grams composed of 1, 2, 3 or 4 terms. Default value is |
||||||||||||||||||
method |
is a character object. It indicates the factorial method used to create the factorial map. Use |
||||||||||||||||||
quali.supp |
is a vector indicating the indexes of the categorical supplementary variables. It is used only for CA and MCA. |
||||||||||||||||||
quanti.supp |
is a vector indicating the indexes of the quantitative supplementary variables. It is used only for CA and MCA. |
||||||||||||||||||
minDegree |
is an integer. It indicates the minimum occurrences of terms to analyze and plot. The default value is 2. |
||||||||||||||||||
clust |
is an integer or a character. If clust="auto", the number of clusters is chosen automatically, otherwise clust can be an integer between 2 and 8. |
||||||||||||||||||
k.max |
is an integer. It indicates the maximum number of clusters to keep. The default value is 5. The max value is 20. |
||||||||||||||||||
stemming |
is logical. If TRUE, Porter's stemming algorithm is applied to all extracted terms. The default is |
||||||||||||||||||
labelsize |
is an integer. It indicates the label size in the plot. Default is |
||||||||||||||||||
documents |
is an integer. It indicates the number of documents per cluster to plot in the factorial map. The default value is 2. It is used only for CA and MCA. |
||||||||||||||||||
graph |
is logical. If TRUE the function plots the maps otherwise they are saved in the output object. Default value is TRUE |
||||||||||||||||||
remove.terms |
is a character vector. It contains a list of additional terms to delete from the documents before term extraction. The default is |
||||||||||||||||||
synonyms |
is a character vector. Each element contains a list of synonyms, separated by ";", that will be merged into a single term (the first word contained in the vector element). The default is |
It is an object of the class list
containing the following components:
net | bipartite network | |
res | Results of CA, MCA or MDS method | |
km.res | Results of cluster analysis | |
graph_terms | Conceptual structure map (class "ggplot2") | |
graph_documents_Contrib | Factorial map of the documents with the highest contributions (class "ggplot2") | |
graph_documents_TC | Factorial map of the most cited documents (class "ggplot2") |
termExtraction
to extract terms from a textual field (abstract, title,
author's keywords, etc.) of a bibliographic data frame.
biblioNetwork
to compute a bibliographic network.
cocMatrix
to compute a co-occurrence matrix.
biblioAnalysis
to perform a bibliometric analysis.
# EXAMPLE Conceptual Structure using Keywords Plus
data(scientometrics, package = "bibliometrixData")
CS <- conceptualStructure(scientometrics, field = "ID", method = "CA",
                          stemming = FALSE, minDegree = 3, k.max = 5)
It converts SCOPUS, Clarivate Analytics WoS, Dimensions, Lens.org, PubMed and Cochrane database export files, or pubmedR and dimensionsR JSON/XML objects, into a data frame, with cases corresponding to articles and variables to Field Tags as used in WoS.
convert2df( file, dbsource = "wos", format = "plaintext", remove.duplicates = TRUE )
file |
a character array containing a sequence of filenames coming from WoS, Scopus, Dimensions, Lens.org, OpenAlex and Pubmed. Alternatively,
|
||||||||||||||||||||||||
dbsource |
is a character indicating the bibliographic database. |
||||||||||||||||||||||||
format |
is a character indicating the SCOPUS, Clarivate Analytics WoS, and other databases export file format. |
||||||||||||||||||||||||
remove.duplicates |
is logical. If TRUE, the function will remove duplicated items checking by DOI and database ID. |
a data frame with cases corresponding to articles and variables to Field Tags in the original export file.
For example, if we have three files downloaded from Web of Science in plaintext format, file will be:
file <- c("filename1.txt", "filename2.txt", "filename3.txt")
data frame columns are named using the standard Clarivate Analytics WoS Field Tag codify. The main field tags are:
AU
|
Authors | |
TI
|
Document Title | |
SO
|
Publication Name (or Source) | |
JI
|
ISO Source Abbreviation | |
DT
|
Document Type | |
DE
|
Authors' Keywords | |
ID
|
Keywords associated by SCOPUS or WoS database | |
AB
|
Abstract | |
C1
|
Author Address | |
RP
|
Reprint Address | |
CR
|
Cited References | |
TC
|
Times Cited | |
PY
|
Year | |
SC
|
Subject Category | |
UT
|
Unique Article Identifier | |
DB
|
Database |
for a complete list of field tags see: Field Tags used in bibliometrix
# Example:
# Import and convert a Web of Science collection from an export file in plaintext format:
## Not run:
files <- 'https://www.bibliometrix.org/datasets/wos_plaintext.txt'
M <- convert2df(file = files, dbsource = 'wos', format = "plaintext")
## End(Not run)
Data frame containing a normalized index of countries.
Data are used by biblioAnalysis
function
to extract Country Field of Cited References and Authors.
A data frame with 199 rows and 4 variables:
country names
continent names
country centroid longitude
country centroid latitude
It performs a coupling network analysis and plots community detection results on a bi-dimensional map (Coupling Map).
couplingMap( M, analysis = "documents", field = "CR", n = 500, label.term = NULL, ngrams = 1, impact.measure = "local", minfreq = 5, community.repulsion = 0.1, stemming = FALSE, size = 0.5, n.labels = 1, repel = TRUE, cluster = "walktrap" )
M |
is a bibliographic dataframe. |
analysis |
is the textual attribute used to select the unit of analysis. It can be |
field |
is the textual attribute used to measure the coupling strength. It can be |
n |
is an integer. It indicates the number of units to include in the analysis. |
label.term |
is a character. It indicates which content metadata to use for cluster labeling. It can be |
ngrams |
is an integer between 1 and 4. It indicates the type of n-gram to extract from texts.
An n-gram is a contiguous sequence of n terms. The function can extract n-grams composed of 1, 2, 3 or 4 terms. Default value is |
impact.measure |
is a character. It indicates the impact measure used to rank cluster elements (documents, authors or sources).
It can be |
minfreq |
is an integer. It indicates the minimum frequency (per thousand) of a cluster. It is a number in the range (0,1000). |
community.repulsion |
is a real. It indicates the repulsion force among network communities. It is a real number between 0 and 1. Default is |
stemming |
is logical. If it is TRUE, the words (from titles or abstracts) will be stemmed using Porter's algorithm. |
size |
is numerical. It indicates the size of the cluster circles and is a number in the range (0.01,1). |
n.labels |
is an integer. It indicates how many labels to associate with each cluster. Default is |
repel |
is logical. If it is TRUE ggplot uses geom_label_repel instead of geom_label. |
cluster |
is a character. It indicates the type of cluster to perform among ("optimal", "louvain","leiden", "infomap","edge_betweenness","walktrap", "spinglass", "leading_eigen", "fast_greedy"). |
The analysis can be performed on three different units (documents, authors or sources), and the coupling strength can be measured using the classical approach (coupled by references) or a novel approach based on unit contents (keywords or terms from titles and abstracts).
The x-axis measures the cluster centrality (by Callon's Centrality index) while the y-axis measures the cluster impact by Mean Normalized Local Citation Score (MNLCS). The Normalized Local Citation Score (NLCS) of a document is calculated by dividing the actual count of local citing items by the expected citation rate for documents with the same year of publication.
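The normalization described above can be sketched in base R. This is only an illustration on made-up data: the data frame `docs`, its column names and the cluster labels are assumptions, and the "expected citation rate" is taken here to be the mean local citation count of documents published in the same year.

```r
# Toy data: local citation counts (lcs) for six documents; all values are made up.
docs <- data.frame(
  id      = paste0("doc", 1:6),
  year    = c(2010, 2010, 2010, 2015, 2015, 2015),
  lcs     = c(4, 2, 0, 1, 3, 2),
  cluster = c("A", "A", "B", "A", "B", "B"),
  stringsAsFactors = FALSE
)

# Assumed expected citation rate: mean lcs of documents with the same publication year.
docs$expected <- ave(docs$lcs, docs$year, FUN = mean)

# Normalized Local Citation Score (NLCS) of each document.
docs$nlcs <- docs$lcs / docs$expected

# Mean NLCS per cluster (MNLCS), the quantity shown on the y-axis.
mnlcs <- tapply(docs$nlcs, docs$cluster, mean)

print(docs)
print(mnlcs)
```

A document with an NLCS above 1 is cited locally more often than the average document of its publication year.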
a list containing:
map
|
The coupling map as ggplot2 object | |
clusters
|
Centrality and Density values for each cluster. | |
data
|
A list of units falling in each cluster | |
nclust
|
The number of clusters | |
NCS
|
The Normalized Citation Score dataframe | |
net
|
A list containing the network output (as provided by the networkPlot function) |
biblioNetwork
function to compute a bibliographic network.
cocMatrix
to compute a bibliographic bipartite network.
networkPlot
to plot a bibliographic network.
## Not run:
data(management, package = "bibliometrixData")
res <- couplingMap(management, analysis = "authors", field = "CR", n = 250,
                   impact.measure = "local", minfreq = 3, size = 0.5, repel = TRUE)
plot(res$map)
## End(Not run)
List containing a set of custom theme variables for Biblioshiny.
A list with 3 elements:
object name
attributes
CSS style
It calculates the authors' dominance ranking from an object of the class 'bibliometrix', as proposed by Kumar & Kumar, 2008.
dominance(results, k = 10)
results |
is an object of the class ' |
k |
is an integer, used for table formatting (number of authors). Default value is 10. |
The function dominance
returns a data frame with cases corresponding to the first k
most productive authors and variables to the typical fields of a dominance analysis.
The data frame variables are:
Author |
Author's name | |
Dominance Factor |
Dominance Factor (DF = FAA / MAA) | |
Tot Articles |
N. of Authored Articles (TAA) | |
Single Authored |
N. of Single-Authored Articles (SAA) | |
Multi Authored |
N. of Multi-Authored Articles (MAA=TAA-SAA) | |
First Authored |
N. of First Authored Articles (FAA) | |
Rank by Articles |
Author Ranking by N. of Articles | |
Rank by DF |
Author Ranking by Dominance Factor |
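The quantities in the table above can be reproduced by hand on a small example. The sketch below uses base R and made-up author strings. It assumes that the first name in each string is the first author and that FAA counts first-authored articles among multi-authored ones only; neither assumption is stated in the table.

```r
# Toy author lists, one string per document (first author listed first); names are made up.
AU <- c("ROSSI M;BIANCHI L", "ROSSI M", "BIANCHI L;ROSSI M;VERDI G",
        "ROSSI M;VERDI G", "VERDI G")

authors_by_doc <- strsplit(AU, ";", fixed = TRUE)
multi <- lengths(authors_by_doc) > 1                     # multi-authored documents
first <- vapply(authors_by_doc, `[`, character(1), 1)   # first author of each document

all_authors <- sort(unique(unlist(authors_by_doc)))

# Number of documents (restricted to `keep`) in which author `a` appears.
count_docs <- function(a, keep) {
  sum(vapply(authors_by_doc[keep], function(x) a %in% x, logical(1)))
}

res <- data.frame(
  Author = all_authors,
  TAA = vapply(all_authors, count_docs, integer(1),
               keep = rep(TRUE, length(AU)), USE.NAMES = FALSE),
  SAA = vapply(all_authors, count_docs, integer(1),
               keep = !multi, USE.NAMES = FALSE),
  FAA = vapply(all_authors, function(a) sum(first[multi] == a), integer(1),
               USE.NAMES = FALSE),
  stringsAsFactors = FALSE
)
res$MAA <- res$TAA - res$SAA   # multi-authored articles
res$DF  <- res$FAA / res$MAA   # dominance factor; NaN if an author has no multi-authored article
res <- res[order(-res$DF), ]
print(res)
```

In this toy collection ROSSI M is first author of two of three multi-authored articles, so the dominance factor is 2/3.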
biblioAnalysis
function for bibliometric analysis
summary
method for class 'bibliometrix'
data(scientometrics, package = "bibliometrixData")
results <- biblioAnalysis(scientometrics)
DF <- dominance(results)
DF
Searches for duplicated records in a data frame.
duplicatedMatching(M, Field = "TI", exact = FALSE, tol = 0.95)
M |
is the bibliographic data frame. |
Field |
is a character object. It indicates one of the field tags used to identify duplicated records. Field can be equal to one of these tags: TI (title), AB (abstract), UT (manuscript ID). |
exact |
is logical. If exact = TRUE the function searches duplicates using exact matching. If exact=FALSE, the function uses the restricted Damerau-Levenshtein distance to find duplicated documents. |
tol |
is a numeric value giving the minimum relative similarity to match two manuscripts. Default value is |
A bibliographic data frame is obtained by the converting function convert2df.
It is a data matrix with cases corresponding to manuscripts and variables to Field Tag in the original SCOPUS and Clarivate Analytics WoS file.
The function identifies duplicated records in a bibliographic data frame and deletes them.
Duplicate entries are identified through the restricted Damerau-Levenshtein distance.
Two manuscripts that have a relative similarity measure greater than tol
argument are stored in the output data frame only once.
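The thresholding logic can be illustrated with base R. The sketch below is not the package's implementation: it swaps in the generalized Levenshtein distance from `utils::adist` (which does not treat transpositions as a single edit) for the restricted Damerau-Levenshtein distance, and it assumes that relative similarity is one minus the distance divided by the length of the longer string. The titles are made up.

```r
# Toy titles; the second differs from the first by a single character.
titles <- c("Bibliometrix: an R-tool for comprehensive science mapping analysis",
            "Bibliometrix: an R tool for comprehensive science mapping analysis",
            "A completely different article title")

# Pairwise edit distances (generalized Levenshtein, used here as a stand-in).
d <- utils::adist(titles)

# Assumed relative similarity: 1 - distance / length of the longer string.
len <- nchar(titles)
sim <- 1 - d / outer(len, len, pmax)

tol <- 0.95

# Flag a record as duplicated if it is similar enough to any earlier record.
dup <- rep(FALSE, length(titles))
for (j in seq_along(titles)[-1]) {
  dup[j] <- any(sim[seq_len(j - 1), j] > tol)
}

unique_titles <- titles[!dup]
print(round(sim, 3))
print(unique_titles)
```

Raising tol towards 1 makes the matching stricter, so fewer near-identical records are merged.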
the value returned from duplicatedMatching
is a data frame without duplicated records.
convert2df
to import and convert a WoS or SCOPUS Export file in a bibliographic data frame.
biblioAnalysis
function for bibliometric analysis.
summary
to obtain a summary of the results.
plot
to draw some useful plots of the results.
data(scientometrics, package = "bibliometrixData")
M <- rbind(scientometrics[1:20, ], scientometrics[10:30, ])
newM <- duplicatedMatching(M, Field = "TI", exact = FALSE, tol = 0.95)
dim(newM)
It calculates the median year for each item of a field tag.
fieldByYear( M, field = "ID", timespan = NULL, min.freq = 2, n.items = 5, labelsize = NULL, remove.terms = NULL, synonyms = NULL, dynamic.plot = FALSE, graph = TRUE )
M |
is a bibliographic data frame obtained by |
field |
is a character object. It indicates one of the field tags of the standard ISI WoS Field Tag codify. |
timespan |
is a vector with the min and max year. If it is = NULL, the analysis is performed on the entire period. Default is |
min.freq |
is an integer. It indicates the min frequency of the items to include in the analysis |
n.items |
is an integer. It indicates the maximum number of items per year to include in the plot. |
labelsize |
is a deprecated argument. It will be removed in the next update. |
remove.terms |
is a character vector. It contains a list of additional terms to delete from the documents before term extraction. The default is |
synonyms |
is a character vector. Each element contains a list of synonyms, separated by ";", that will be merged into a single term (the first word contained in the vector element). The default is |
dynamic.plot |
is a logical. If TRUE plot aesthetics are optimized for plotly package. |
graph |
is logical. If TRUE the function plots the Field Tag distribution by Year graph. Default is |
The function fieldByYear
returns a list containing three objects:
df |
is a data frame | |
df_graph
|
is a data frame with data used to build the graph | |
graph |
a ggplot object |
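The core computation, a median publication year per item, can be sketched in base R on a toy data frame. The column names `PY` and `ID` follow the field tags used in this manual. The data, the `;` separator and the frequency filter mirroring `min.freq` are illustrative assumptions.

```r
# Toy bibliographic data: publication year and ';'-separated keywords (made up).
M <- data.frame(
  PY = c(2008, 2010, 2012, 2012, 2015),
  ID = c("CITATION;IMPACT", "CITATION", "IMPACT;NETWORKS",
         "CITATION;NETWORKS", "NETWORKS"),
  stringsAsFactors = FALSE
)

# One row per (item, year) pair.
terms <- strsplit(M$ID, ";", fixed = TRUE)
long <- data.frame(
  item = unlist(terms),
  year = rep(M$PY, lengths(terms)),
  stringsAsFactors = FALSE
)

# Keep only items that occur at least min_freq times (hypothetical threshold).
min_freq <- 2
freq <- table(long$item)
long <- long[long$item %in% names(freq)[freq >= min_freq], ]

# Median publication year of the documents containing each item.
med <- tapply(long$year, long$item, median)
print(med)
```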
biblioAnalysis
function for bibliometric analysis
summary
method for class 'bibliometrix'
data(management, package = "bibliometrixData")
timespan <- c(2005, 2015)
res <- fieldByYear(management, field = "ID", timespan = timespan,
                   min.freq = 5, n.items = 5, graph = TRUE)
It calculates the authors' h-index and its variants.
Hindex(M, field = "author", elements = NULL, sep = ";", years = Inf)
M |
is a bibliographic data frame obtained by the converting function |
field |
is character. It can be equal to c("author", "source"). field indicates whether the H-index has to be calculated for a list of authors or for a list of sources. Default
value is |
elements |
is a character vector. It contains the list of authors' names or sources for which you want to calculate the H-index. When the field is
"author", the argument has the form c("SURNAME1 N","SURNAME2 N",...), in other words, for each author: surname and initials separated by one blank space. If elements=NULL, the function calculates impact indices for all elements contained in the data frame.
i.e for the authors SEMPRONIO TIZIO CAIO and ARIA MASSIMO |
sep |
is the field separator character. This character separates authors in each string of AU column of the bibliographic data frame. The default is |
years |
is an integer. It indicates the number of years to consider for the H-index calculation. Default is Inf. |
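For orientation, the h-index and two of its common variants can be computed from a vector of citation counts as sketched below. The helper `impact_indices` is hypothetical and uses the textbook definitions, with the m-index taken as h divided by the number of years since the first publication. The function documented here may differ in detail.

```r
# Hypothetical helper: h-, g- and m-index from citation counts (textbook definitions).
impact_indices <- function(citations, first_year, current_year) {
  tc <- sort(citations, decreasing = TRUE)
  rank <- seq_along(tc)
  h <- sum(tc >= rank)                      # largest h with h papers cited at least h times
  g <- sum(cumsum(tc) >= rank^2)            # largest g whose top g papers total >= g^2 citations
  m <- h / (current_year - first_year + 1)  # h-index per year of activity (assumed definition)
  c(h = h, g = g, m = m)
}

# Made-up citation counts for six papers, first published in 2015, evaluated in 2024.
res <- impact_indices(c(10, 8, 5, 4, 3, 0), first_year = 2015, current_year = 2024)
print(res)
```

Here four papers have at least four citations each (h = 4), the five most cited papers total 30 citations, which is at least 25 (g = 5), and ten years of activity give m = 0.4.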
an object of class
"list". It contains two elements: H is a data frame with h-index, g-index and m-index for each author; CitationList is a list with the bibliographic collection for each author.
convert2df
to import and convert a WoS or SCOPUS Export file in a bibliographic data frame.
biblioAnalysis
function for bibliometric analysis.
summary
to obtain a summary of the results.
plot
to draw some useful plots of the results.
### EXAMPLE 1: ###
data(scientometrics, package = "bibliometrixData")
authors <- c("SMALL H", "CHEN DZ")
Hindex(scientometrics, field = "author", elements = authors, sep = ";")$H
Hindex(scientometrics, field = "source", elements = "SCIENTOMETRICS", sep = ";")$H

### EXAMPLE 2: Garfield h-index ###
data(garfield, package = "bibliometrixData")
indices <- Hindex(garfield, field = "author", elements = "GARFIELD E", years = Inf, sep = ";")
# h-index, g-index and m-index of Eugene Garfield
indices$H
# Papers and total citations
head(indices$CitationList[[1]])
histNetwork
creates a historical citation network from a bibliographic
data frame.
histNetwork(M, min.citations, sep = ";", network = TRUE, verbose = TRUE)
M |
is a bibliographic data frame obtained by the converting function
|
min.citations |
DEPRECATED. The new algorithm does not use this parameter. It will be removed in the next version of bibliometrix. |
sep |
is the field separator character. This character separates strings
in CR column of the data frame. The default is |
network |
is logical. If TRUE, the function also calculates and returns the direct citation network. If FALSE, the function returns only the local citation table. |
verbose |
is logical. If TRUE, results are printed on screen. |
histNetwork
returns an object of class
"list"
containing the following components:
NetMatrix | the historical co-citation network matrix | |
histData | the set of n most cited references | |
M | the bibliographic data frame |
convert2df
to import and convert a supported
export file in a bibliographic data frame.
summary
to obtain a summary of the results.
plot
to draw some useful plots of the results.
biblioNetwork
to compute a bibliographic network.
## Not run:
data(management, package = "bibliometrixData")
histResults <- histNetwork(management, sep = ";")
## End(Not run)
histPlot
plots a historical co-citation network.
histPlot( histResults, n = 20, size = 5, labelsize = 5, remove.isolates = TRUE, title_as_label = FALSE, label = "short", verbose = TRUE )
histResults |
is an object of
is a network matrix obtained by the function |
||||||||||||
n |
is integer. It defines the number of vertices to plot. |
||||||||||||
size |
is an integer. It defines the point size of the vertices. Default value is 5. |
||||||||||||
labelsize |
is an integer. It indicates the label size in the plot. Default is |
||||||||||||
remove.isolates |
is logical. If TRUE, isolated vertices are not plotted. |
||||||||||||
title_as_label |
is a logical. DEPRECATED |
||||||||||||
label |
is a character. It indicates which label type to use as node id in the historiograph. It can be |
||||||||||||
verbose |
is logical. If TRUE, results and plots are printed on screen. |
The function histPlot
can plot a historical co-citation network previously created by histNetwork
.
It is a list containing a network object of the class igraph and a plot object of the class ggraph.
histNetwork
to compute a historical co-citation network.
cocMatrix
to compute a co-occurrence matrix.
biblioAnalysis
to perform a bibliometric analysis.
# EXAMPLE Citation network
## Not run:
data(management, package = "bibliometrixData")
histResults <- histNetwork(management, sep = ";")
net <- histPlot(histResults, n = 20, labelsize = 5)
## End(Not run)
Uses SCOPUS API author search to identify author identification information.
idByAuthor(df, api_key)
df |
is a dataframe composed of three columns:
e.g. df[1,1:3]<-c("aria","massimo","naples"). When the affiliation is not specified, the field df$affiliation must be NA, e.g. df[2,1:3]<-c("cuccurullo","corrado", NA) |
|||||||||
api_key |
is a character. It contains the Elsevier API key. Information about how to obtain an API key is available on the Elsevier API website |
a data frame with cases corresponding to authors and variables to the authors' information and IDs obtained from SCOPUS.
retrievalByAuthorID
for downloading the complete author bibliographic collection from SCOPUS
## Request a personal API key from the Elsevier web page https://dev.elsevier.com/sc_apis.html
#
# api_key="your api key"

## create a data frame with the list of authors to get information and IDs
# i.e. df[1,1:3]<-c("aria","massimo","naples")
#      df[2,1:3]<-c("cuccurullo","corrado", NA)

## run idByAuthor function
#
# authorsID <- idByAuthor(df, api_key)
It associates authors' keywords to keywords plus.
keywordAssoc(M, sep = ";", n = 10, excludeKW = NA)
M |
is a bibliographic data frame obtained by the converting function |
sep |
is the field separator character. This character separates keywords in each string of ID and DE columns of the bibliographic data frame. The default is |
n |
is an integer. It indicates the number of authors' keywords to associate to each keyword plus. The default is |
excludeKW |
is a character vector. It contains authors' keywords to exclude from the analysis. |
an object of class
"list".
convert2df
to import and convert a WoS or SCOPUS Export file in a bibliographic data frame.
biblioAnalysis
function for bibliometric analysis.
summary
to obtain a summary of the results.
plot
to draw some useful plots of the results.
data(scientometrics, package = "bibliometrixData")
KWlist <- keywordAssoc(scientometrics, sep = ";", n = 10, excludeKW = NA)
# list of first 10 Keywords plus
names(KWlist)
# list of first 10 authors' keywords associated to the first Keyword plus
KWlist[[1]][1:10]
It calculates yearly occurrences of top keywords/terms.
KeywordGrowth( M, Tag = "ID", sep = ";", top = 10, cdf = TRUE, remove.terms = NULL, synonyms = NULL )
M |
is a data frame obtained by the converting function |
Tag |
is a character object. It indicates one of the keyword field tags of the
standard ISI WoS Field Tag codify (ID or DE) or a field tag created by |
sep |
is the field separator character. This character separates strings in each keyword column of the data frame. The default is |
top |
is a numeric. It indicates the number of top keywords to analyze. The default value is 10. |
cdf |
is a logical. If TRUE, the function calculates the cumulative occurrences distribution. |
remove.terms |
is a character vector. It contains a list of additional terms to delete from the documents before term extraction. The default is |
synonyms |
is a character vector. Each element contains a list of synonyms, separated by ";", that will be merged into a single term (the first word contained in the vector element). The default is |
an object of class data.frame
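What "yearly occurrences" and their cumulative version (the cdf argument) mean can be illustrated in base R. The sketch below uses a made-up data frame with `PY` and `ID` columns and a `;` separator. It shows the idea only and is not the function's implementation.

```r
# Toy bibliographic data: publication year and ';'-separated keywords (made up).
M <- data.frame(
  PY = c(2010, 2010, 2011, 2012, 2012),
  ID = c("CITATION;IMPACT", "CITATION", "IMPACT", "CITATION;IMPACT", "CITATION"),
  stringsAsFactors = FALSE
)

# One row per (year, term) pair.
terms <- strsplit(M$ID, ";", fixed = TRUE)
long <- data.frame(
  Year = rep(M$PY, lengths(terms)),
  Term = unlist(terms),
  stringsAsFactors = FALSE
)

# Yearly occurrences: one row per year, one column per term.
yearly <- as.data.frame.matrix(table(long$Year, long$Term))

# Cumulative occurrences over the years.
cumulative <- as.data.frame(lapply(yearly, cumsum))
cumulative <- cbind(Year = as.numeric(rownames(yearly)), cumulative)

print(yearly)
print(cumulative)
```

Years in which no document was published do not appear as rows in this sketch.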
data(scientometrics, package = "bibliometrixData")
topKW <- KeywordGrowth(scientometrics, Tag = "ID", sep = ";", top = 5, cdf = TRUE)
topKW

# Plotting results
## Not run:
install.packages("reshape2")
library(reshape2)
library(ggplot2)
DF <- melt(topKW, id = 'Year')
ggplot(DF, aes(Year, value, group = variable, color = variable)) + geom_line()
## End(Not run)
It calculates local citations (LCS) of authors and documents of a bibliographic collection.
localCitations(M, fast.search = FALSE, sep = ";", verbose = FALSE)
M |
is a bibliographic data frame obtained by the converting function |
fast.search |
is logical. If TRUE, the function calculates local citations only for the top 25 percent most cited documents. |
sep |
is the field separator character. This character separates citations in each string of CR column of the bibliographic data frame. The default is |
verbose |
is a logical. If TRUE, results are printed on screen. |
Local citations measure how many times an author (or a document) included in this collection has been cited by the documents also included in the collection.
an object of class
"list" containing author local citations and document local citations.
citations
function for citation frequency distribution.
biblioAnalysis
function for bibliometric analysis.
summary
to obtain a summary of the results.
plot
to draw some useful plots of the results.
data(scientometrics, package = "bibliometrixData")
CR <- localCitations(scientometrics, sep = ";")
CR$Authors[1:10, ]
CR$Papers[1:10, ]
The matrix contains the rgb format of the bibliometrix official logo.
A matrix with 927 rows and 800 columns.
It estimates Lotka's law coefficients for scientific productivity (Lotka A.J., 1926).
lotka(results)
results |
is an object of the class ' |
Reference:
Lotka, A. J. (1926). The frequency distribution of scientific productivity. Journal of the Washington academy of sciences, 16(12), 317-323.
The function lotka returns a list of summary statistics of the Lotka's law estimation for an object of class bibliometrix.
The list contains the following objects:
Beta |
Beta coefficient | |
C |
Constant coefficient | |
R2 |
Goodness of Fit | |
fitted |
Fitted Values | |
p.value |
P-value of the two-sample Kolmogorov-Smirnov test between the empirical and the theoretical Lotka's Law distribution (with Beta=2) | |
AuthorProd |
Authors' Productivity frequency table |
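One common way to estimate the two coefficients is a log-log regression of the share of authors on the number of articles written. The sketch below uses base R and synthetic data. It assumes this estimation approach rather than documenting the function's internals, and the productivity table is constructed to follow an inverse-square law exactly.

```r
# Synthetic author-productivity table following C / n^2 exactly (illustrative values only).
n_articles <- 1:6
n_authors  <- 360 / n_articles^2
freq <- n_authors / sum(n_authors)   # share of authors with n articles

# Lotka's law: freq = C / n^Beta, so log10(freq) = log10(C) - Beta * log10(n).
fit <- lm(log10(freq) ~ log10(n_articles))

beta <- -unname(coef(fit)[2])    # slope with the sign flipped
C    <- 10^unname(coef(fit)[1])  # constant, back-transformed from the intercept

print(c(Beta = beta, C = C))
```

With real data the points scatter around the fitted line, and the goodness of fit and the Kolmogorov-Smirnov p-value listed above indicate how closely the collection follows the law.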
biblioAnalysis
function for bibliometric analysis
summary
method for class 'bibliometrix'
data(scientometrics, package = "bibliometrixData")
results <- biblioAnalysis(scientometrics)
L <- lotka(results)
L
Merges bibliographic data frames from different databases (WoS, SCOPUS, Lens, OpenAlex, etc.) into a single one.
mergeDbSources(..., remove.duplicated = TRUE, verbose = TRUE)
... |
are the bibliographic data frames to merge. |
remove.duplicated |
is logical. If TRUE duplicated documents will be deleted from the bibliographic collection. |
verbose |
is logical. If TRUE, information on duplicate documents is printed on the screen. |
Bibliographic data frames are obtained by the converting function convert2df.
The function merges data frames identifying common tag fields and duplicated records.
the value returned from mergeDbSources
is a bibliographic data frame.
convert2df
to import and convert an ISI or SCOPUS Export file in a bibliographic data frame.
biblioAnalysis
function for bibliometric analysis.
summary
to obtain a summary of the results.
plot
to draw some useful plots of the results.
data(isiCollection, package = "bibliometrixData")
data(scopusCollection, package = "bibliometrixData")
M <- mergeDbSources(isiCollection, scopusCollection, remove.duplicated = TRUE)
dim(M)
It extracts other field tags, different from the standard WoS/SCOPUS codify.
metaTagExtraction(M, Field = "CR_AU", sep = ";", aff.disamb = TRUE)
M |
is a data frame obtained by the converting function |
||||||||||||||||||
Field |
is a character object. New tag extracted from aggregated data is specified by this string. Field can be equal to one of these tags:
|
||||||||||||||||||
sep |
is the field separator character. This character separates strings in each column of the data frame. The default is |
||||||||||||||||||
aff.disamb |
is a logical. If TRUE and Field="AU_UN", then a disambiguation algorithm is used to identify and match scientific affiliations
(univ, research centers, etc.). The default is |
the bibliometric data frame with a new column containing data about the new field tag indicated in the argument Field.
convert2df
for importing and converting bibliographic files into a data frame.
biblioAnalysis
function for bibliometric analysis
# Example 1: First Authors for each cited reference
data(scientometrics, package = "bibliometrixData")
scientometrics <- metaTagExtraction(scientometrics, Field = "CR_AU", sep = ";")
unlist(strsplit(scientometrics$CR_AU[1], ";"))

# Example 2: Source for each cited reference
data(scientometrics, package = "bibliometrixData")
scientometrics <- metaTagExtraction(scientometrics, Field = "CR_SO", sep = ";")
unlist(strsplit(scientometrics$CR_SO[1], ";"))

# Example 3: Affiliation country for co-authors
data(scientometrics, package = "bibliometrixData")
scientometrics <- metaTagExtraction(scientometrics, Field = "AU_CO", sep = ";")
scientometrics$AU_CO[1:10]
It calculates the percentage of missing data in the metadata of a bibliographic data frame.
missingData(M)
M |
is a bibliographic data frame obtained by |
Each metadata is assigned a status c("Excellent," "Good," "Acceptable", "Poor", "Critical," "Completely missing") depending on the percentage of missing data. In particular, the column *status* classifies the percentage of missing value in 5 categories: "Excellent" (0 "Poor" (from 20.01
The results of the function allow us to understand which analyses can be performed with bibliometrix and which cannot based on the completeness (or status) of different metadata.
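As a rough illustration of the idea, the following base-R sketch computes the share of missing values per column of a small, made-up data frame. Treating both NA and empty strings as missing, and the helper name pct_missing, are assumptions made for this example; it is not the package's implementation.

```r
# Toy bibliographic data frame (hypothetical values)
M_toy <- data.frame(
  AU = c("ARIA M", "CUCCURULLO C", NA, "ARIA M"),
  DE = c("bibliometrics", "", NA, ""),
  PY = c(2017, 2018, 2019, 2020),
  stringsAsFactors = FALSE
)

# Percentage of missing entries in one column
# (assumption: NA and empty strings both count as missing)
pct_missing <- function(x) {
  100 * mean(is.na(x) | trimws(as.character(x)) == "")
}

missing_pct <- sapply(M_toy, pct_missing)
print(round(missing_pct, 2))
```

A tag with a high percentage here (DE in the toy data) would be a poor basis for any analysis that depends on it, which is the kind of judgement the status column is meant to support.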
The function missingData
returns a list containing two objects:
allTags |
is a data frame including results for all original metadata tags from the collection | |
mandatoryTags
|
is a data frame that includes only the tags needed for analysis with bibliometrix and biblioshiny. |
data(scientometrics, package = "bibliometrixData")
res <- missingData(scientometrics)
print(res$mandatoryTags)
The function net2Pajek saves a bibliographic network previously created by networkPlot as Pajek files.
net2Pajek(net, filename = "my_pajek_network", path = NULL)
net |
is a network graph object returned by the function |
filename |
is a character. It indicates the filename for Pajek export files. |
path |
is a character. It indicates the path where the files will be saved. When path = NULL, the files will be saved in the current folder. Default is NULL. |
The function returns no object but will save three Pajek files in the folder given in the "path" argument with the name "filename.clu," "filename.vec," and "filename.net."
net2VOSviewer
to export and plot the network with VOSviewer software.
## Not run:
data(management, package = "bibliometrixData")
NetMatrix <- biblioNetwork(management, analysis = "co-occurrences", network = "keywords", sep = ";")
net <- networkPlot(NetMatrix, n = 30, type = "auto", Title = "Co-occurrence Network", labelsize = 1)
net2Pajek(net, filename = "pajekfiles", path = NULL)
## End(Not run)
net2VOSviewer
plots a network created with networkPlot
using VOSviewer by Nees Jan van Eck and Ludo Waltman.
net2VOSviewer(net, vos.path = NULL)
net |
is an object created by networkPlot function. |
vos.path |
is a character indicating the full path where VOSviewer.jar is located. |
The function networkPlot
can plot a bibliographic network previously created by biblioNetwork
.
The network map can be plotted using internal R routines or using VOSviewer by Nees Jan van Eck and Ludo Waltman.
It writes a .net file that can be opened in VOSviewer.
biblioNetwork
to compute a bibliographic network.
networkPlot
to create and plot a network object
# EXAMPLE
# VOSviewer.jar has to be present in the working folder
# data(scientometrics, package = "bibliometrixData")
# NetMatrix <- biblioNetwork(scientometrics, analysis = "co-citation",
#   network = "references", sep = ";")
# net <- networkPlot(NetMatrix, n = 30, type = "kamada", Title = "Co-Citation", labelsize = 0.5)
# net2VOSviewer(net)
networkPlot
plots a bibliographic network.
networkPlot( NetMatrix, normalize = NULL, n = NULL, degree = NULL, Title = "Plot", type = "auto", label = TRUE, labelsize = 1, label.cex = FALSE, label.color = FALSE, label.n = NULL, halo = FALSE, cluster = "walktrap", community.repulsion = 0.1, vos.path = NULL, size = 3, size.cex = FALSE, curved = FALSE, noloops = TRUE, remove.multiple = TRUE, remove.isolates = FALSE, weighted = NULL, edgesize = 1, edges.min = 0, alpha = 0.5, verbose = TRUE )
NetMatrix |
is a network matrix obtained by the function |
normalize |
is a character. It can be "association", "jaccard", "inclusion", "salton" or "equivalence" to obtain the Association Strength, Jaccard, Inclusion, Salton or Equivalence similarity index respectively. The default is normalize = NULL. |
n |
is an integer. It indicates the number of vertices to plot. |
degree |
is an integer. It indicates the min frequency of a vertex. If degree is not NULL, n is ignored. |
Title |
is a character indicating the plot title. |
type |
is a character object. It indicates the network map layout. The default is type = "auto". |
label |
is logical. If TRUE vertex labels are plotted. |
labelsize |
is an integer. It indicates the label size in the plot. Default is labelsize = 1. |
label.cex |
is logical. If TRUE the label size of each vertex is proportional to its degree. |
label.color |
is logical. If TRUE, for each vertex, the label color is the same as its cluster. |
label.n |
is an integer. It indicates the number of vertex labels to draw. |
halo |
is logical. If TRUE communities are plotted using different colors. Default is halo = FALSE. |
cluster |
is a character. It indicates the type of clustering to perform among ("none", "optimal", "louvain", "leiden", "infomap", "edge_betweenness", "walktrap", "spinglass", "leading_eigen", "fast_greedy"). |
community.repulsion |
is a real number between 0 and 1. It indicates the repulsion force among network communities. Default is community.repulsion = 0.1. |
vos.path |
is a character indicating the full path where VOSviewer.jar is located. |
size |
is an integer. It defines the size of each vertex. Default is size = 3. |
size.cex |
is logical. If TRUE the size of each vertex is proportional to its degree. |
curved |
is a logical or a number. If TRUE edges are plotted with an optimal curvature. Default is curved = FALSE. |
noloops |
is logical. If TRUE loops in the network are deleted. |
remove.multiple |
is logical. If TRUE multiple links are plotted using just one edge. |
remove.isolates |
is logical. If TRUE isolated vertices are not plotted. |
weighted |
This argument specifies whether to create a weighted graph from an adjacency matrix. If it is NULL then an unweighted graph is created and the elements of the adjacency matrix give the number of edges between the vertices. If it is a character constant then for every non-zero matrix entry an edge is created and the value of the entry is added as an edge attribute named by the weighted argument. If it is TRUE then a weighted graph is created and the name of the edge attribute will be weight. |
edgesize |
is an integer. It indicates the network edge size. |
edges.min |
is an integer. It indicates the min frequency of edges between two vertices. If edges.min = 0, all edges are plotted. |
alpha |
is a number. Legal alpha values are any numbers from 0 (transparent) to 1 (opaque). The default is alpha = 0.5. |
verbose |
is a logical. If TRUE, the network will be plotted. Default is verbose = TRUE. |
The function networkPlot
can plot a bibliographic network previously created by biblioNetwork
.
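To make the effect of n, noloops and edges.min more concrete, here is a minimal base-R sketch on a made-up co-occurrence matrix. Reading vertex frequency from the diagonal and the variable names keep and edges_min are assumptions of this example; the function itself may select and filter vertices differently.

```r
# Toy co-occurrence matrix (hypothetical values); the diagonal holds loops
NetMatrix <- matrix(c(5, 3, 1, 0,
                      3, 4, 2, 1,
                      1, 2, 3, 0,
                      0, 1, 0, 1),
                    nrow = 4, byrow = TRUE,
                    dimnames = list(paste0("kw", 1:4), paste0("kw", 1:4)))

n <- 3          # keep the n most frequent vertices
edges_min <- 2  # drop links weaker than this value

# Assumption: vertex frequency is taken from the diagonal
keep <- names(sort(diag(NetMatrix), decreasing = TRUE))[seq_len(n)]
A <- NetMatrix[keep, keep]

diag(A) <- 0            # analogue of noloops = TRUE
A[A < edges_min] <- 0   # analogue of edges.min
print(A)
```

In this toy case the weak kw1-kw3 link disappears while the stronger kw1-kw2 and kw2-kw3 links survive.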
It is a list containing the following elements:
graph |
a network object of the class igraph
|
|
cluster_obj |
a communities object of the package igraph
|
|
cluster_res |
a data frame with main results of clustering procedure. |
biblioNetwork
to compute a bibliographic network.
net2VOSviewer
to export and plot the network with VOSviewer software.
cocMatrix
to compute a co-occurrence matrix.
biblioAnalysis
to perform a bibliometric analysis.
# EXAMPLE Keyword co-occurrence network
data(management, package = "bibliometrixData")
NetMatrix <- biblioNetwork(management, analysis = "co-occurrences", network = "keywords", sep = ";")
net <- networkPlot(NetMatrix, n = 30, type = "auto", Title = "Co-occurrence Network", labelsize = 1)
networkStat
calculates main network statistics.
networkStat(object, stat = "network", type = "degree")
object |
is a network matrix obtained by the function |
stat |
is a character. It indicates which statistics are to be calculated. |
type |
is a character. It indicates which centrality index is calculated. type values can be c("degree", "closeness", "betweenness","eigenvector","pagerank","hub","authority", "all"). Default is "degree". |
The function networkStat
can calculate the main network statistics from a bibliographic network previously created by biblioNetwork
.
It is a list containing the following elements:
graph |
a network object of the class igraph
|
|
network |
a list with the main statistics of the network |
|
vertex |
a data frame with the main measures of centrality and prestige of vertices. |
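For orientation, the sketch below computes two of the simplest statistics of this kind, vertex degree and network density, directly from a small adjacency matrix in base R. The matrix is invented for the example and the code is only a sketch of the concepts, not the routine used by networkStat.

```r
# Toy symmetric co-occurrence matrix for four keywords (hypothetical data)
A <- matrix(c(0, 2, 1, 0,
              2, 0, 0, 0,
              1, 0, 0, 3,
              0, 0, 3, 0),
            nrow = 4, byrow = TRUE,
            dimnames = list(c("k1", "k2", "k3", "k4"), c("k1", "k2", "k3", "k4")))

# Binarise: an edge exists wherever the co-occurrence count is positive
B <- (A > 0) * 1
diag(B) <- 0

# Vertex level: degree = number of neighbours
degree <- rowSums(B)

# Network level: size and density of an undirected graph without loops
n_vertices <- nrow(B)
n_edges <- sum(B) / 2
density <- n_edges / (n_vertices * (n_vertices - 1) / 2)

print(degree)
print(density)
```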
biblioNetwork
to compute a bibliographic network.
cocMatrix
to compute a co-occurrence matrix.
biblioAnalysis
to perform a bibliometric analysis.
# EXAMPLE Co-citation network
# to run the example, please remove # from the beginning of the following lines
# data(scientometrics, package = "bibliometrixData")
# NetMatrix <- biblioNetwork(scientometrics, analysis = "co-citation",
#   network = "references", sep = ";")
# netstat <- networkStat(NetMatrix, stat = "all", type = "degree")
It calculates the normalized citation score for documents, authors and sources using both global and local citations.
normalizeCitationScore(M, field = "documents", impact.measure = "local")
M |
is a bibliographic data frame obtained by |
field |
is a character. It indicates the unit of analysis on which to calculate the NCS. It can be equal to |
impact.measure |
is a character. It indicates the impact measure used to rank cluster elements (documents, authors or sources).
It can be |
The Normalized Citation Score (NCS) of a document is calculated by dividing its actual count of citing items by the expected citation rate for documents with the same year of publication.
The MNCS of a set of documents, for example the collected works of an individual or the documents published in a journal, is the average of the NCS values for all the documents in the set.
The NGCS is the NCS calculated using the global citations (total citations that a document received considering the whole bibliographic database).
The NLCS is the NCS calculated using the local citations (total citations that a document received from a set of documents included in the same collection).
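The definitions above can be sketched in a few lines of base R. The data are invented, and the expected citation rate is taken here to be the mean citation count of the documents in the toy collection that share a publication year, which is an assumption of this example; whether the counts are global or local citations only changes the input column.

```r
# Toy collection: publication year and citation count (hypothetical values)
docs <- data.frame(
  id = c("d1", "d2", "d3", "d4", "d5"),
  PY = c(2015, 2015, 2015, 2016, 2016),
  TC = c(10, 20, 30, 4, 12)
)

# Expected citation rate: mean citations of documents published in the same year
expected <- ave(docs$TC, docs$PY, FUN = mean)

# Normalized Citation Score of each document
docs$NCS <- docs$TC / expected

# MNCS of a set of documents (here: a hypothetical author who wrote d1 and d5)
author_docs <- c("d1", "d5")
MNCS <- mean(docs$NCS[docs$id %in% author_docs])

print(docs)
print(MNCS)
```

An NCS above 1 means a document is cited more than the average document of its year; the hypothetical author's MNCS of 1 averages one below-average and one above-average document.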
a dataframe.
## Not run:
data(management, package = "bibliometrixData")
NCS <- normalizeCitationScore(management, field = "authors", impact.measure = "local")
## End(Not run)
It calculates a relative measure of bibliographic co-occurrences.
normalizeSimilarity(NetMatrix, type = "association")
NetMatrix |
is a coupling matrix obtained by the network functions |
type |
is a character. It can be "association", "jaccard", "inclusion","salton" or "equivalence" to obtain Association Strength, Jaccard,
Inclusion, Salton or Equivalence similarity index respectively. The default is |
normalizeSimilarity calculates Association strength, Inclusion, Jaccard, Salton or Equivalence similarity from a co-occurrence bibliographic matrix.
The association strength is used by Van Eck and Waltman (2007) and Van Eck et al. (2006). Several works refer to the measure as the proximity index, while Leydesdorff (2008) and Zitt et al. (2000) refer to it as the probabilistic affinity (or activity) index.
The inclusion index, also called Simpson coefficient, is an overlap measure used in information retrieval.
The Jaccard index (or Jaccard similarity coefficient) gives us a relative measure of the overlap of two sets. It is calculated as the ratio between the intersection and the union of the reference lists (of two manuscripts).
The Salton index, instead, relates the intersection of the two lists to the geometric mean of the size of both sets. The square of Salton index is also called Equivalence index.
The indices are equal to zero if the intersection of the reference lists is empty.
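The following base-R sketch shows how the five indices can be derived from a co-occurrence matrix whose diagonal holds the number of occurrences of each item. It uses the usual textbook definitions on invented data and is meant as an illustration under those assumptions; the package may apply different scaling or handle the diagonal differently.

```r
# Toy co-occurrence matrix: diagonal = occurrences of each item,
# off-diagonal = co-occurrences (hypothetical values)
C <- matrix(c(4, 2, 1,
              2, 5, 0,
              1, 0, 2),
            nrow = 3, byrow = TRUE,
            dimnames = list(c("a", "b", "c"), c("a", "b", "c")))

s <- diag(C)                 # occurrences s_i
SiSj <- outer(s, s)          # s_i * s_j
sum_ij <- outer(s, s, "+")   # s_i + s_j
min_ij <- outer(s, s, pmin)  # min(s_i, s_j)

association <- C / SiSj           # c_ij / (s_i * s_j)
jaccard     <- C / (sum_ij - C)   # intersection over union
inclusion   <- C / min_ij         # c_ij / min(s_i, s_j)
salton      <- C / sqrt(SiSj)     # c_ij / geometric mean of s_i and s_j
equivalence <- salton^2           # square of the Salton index

print(round(jaccard, 3))
```

Items b and c never co-occur in the toy data, so every index is zero for that pair.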
References
Leydesdorff, L. (2008). On the normalization and visualization of author Cocitation data: Salton's cosine versus the Jaccard index. Journal of the American Society for Information Science and Technology, 59(1), 77-85.
Van Eck, N.J., Waltman, L., Van den Berg, J., & Kaymak, U. (2006). Visualizing the computational intelligence field. IEEE Computational Intelligence Magazine, 1(4), 6-10.
Van Eck, N.J., & Waltman, L. (2007). Bibliometric mapping of the computational intelligence field. International Journal of Uncertainty, Fuzziness and Knowledge-Based Systems, 15(5), 625-645.
Van Eck, N.J., & Waltman, L. (2009). How to normalize cooccurrence data? An analysis of some well-known similarity measures. Journal of the American Society for Information Science and Technology, 60(8), 1635-1651.
Zitt, M., Bassecoulard, E., & Okubo, Y. (2000). Shadows of the past in international cooperation: Collaboration profiles of the top five producers of science. Scientometrics, 47(3), 627-657.
a similarity matrix.
biblioNetwork
function to compute a bibliographic network.
cocMatrix
to compute a bibliographic bipartite network.
data(scientometrics, package = "bibliometrixData")
NetMatrix <- biblioNetwork(scientometrics, analysis = "co-occurrences", network = "keywords", sep = ";")
S <- normalizeSimilarity(NetMatrix, type = "association")
plot method for class 'bibliodendrogram'
## S3 method for class 'bibliodendrogram'
plot(x, ...)
x |
is the object for which plots are desired. |
... |
is a generic param for plot functions. |
The function plot
draws a dendrogram.
plot method for class 'bibliometrix'
## S3 method for class 'bibliometrix'
plot(x, ...)
x |
is the object for which plots are desired. |
... |
can accept two arguments: |
The function plot
returns a list of plots of class ggplot2
.
The bibliometric analysis function biblioAnalysis
.
summary
to compute a list of summary statistics of the object of class bibliometrix
.
data(scientometrics, package = "bibliometrixData")
results <- biblioAnalysis(scientometrics)
plot(results, k = 10, pause = FALSE)
It plots a Thematic Evolution Analysis performed using the thematicEvolution function.
plotThematicEvolution(Nodes, Edges, measure = "inclusion", min.flow = 0)
Nodes |
is a list of nodes obtained by |
Edges |
is a list of edges obtained by |
measure |
is a character. It can be |
min.flow |
is numerical. It indicates the minimum value of measure to plot a flow. |
a sankeyPlot
thematicMap
function to create a thematic map based on co-word network analysis and clustering.
thematicEvolution
function to perform a thematic evolution analysis.
networkPlot
to plot a bibliographic network.
## Not run:
data(management, package = "bibliometrixData")
years <- c(2004, 2015)
nexus <- thematicEvolution(management, field = "ID", years = years, n = 100, minFreq = 2)
plotThematicEvolution(nexus$Nodes, nexus$Edges)
## End(Not run)
The function readFiles is deprecated. You can import and convert your export files directly using the function convert2df.
readFiles(...)
... |
is a sequence of names of files downloaded from WoS (in plain text or BibTeX format) or SCOPUS export files (exclusively in BibTeX format). |
a character vector of length the number of lines read.
convert2df
for converting a SCOPUS or ISI export file into a data frame
# WoS or SCOPUS export files can be read using the readFiles function:
# largechar <- readFiles('filename1.txt','filename2.txt','filename3.txt')
# filename1.txt, filename2.txt and filename3.txt are ISI or SCOPUS export files
# in plain text or bibtex format.
# D <- readFiles('https://www.bibliometrix.org/datasets/bibliometrics_articles.txt')
Uses SCOPUS API search to get information about documents on a set of authors using SCOPUS ID.
retrievalByAuthorID(id, api_key, remove.duplicated = TRUE, country = TRUE)
id |
is a vector of characters containing the author's SCOPUS IDs.
SCOPUS IDs can be obtained using the function |
api_key |
is a character. It contains the Elsevier API key. Information about how to obtain an API key is available on the Elsevier API website. |
remove.duplicated |
is logical. If TRUE duplicated documents will be deleted from the bibliographic collection. |
country |
is logical. If TRUE authors' country information will be downloaded from SCOPUS. |
a list containing two objects: (i) M which is a data frame with cases corresponding to articles and variables to main Field Tags named using the standard ISI WoS Field Tag codify. M includes the entire bibliographic collection downloaded from SCOPUS. The main field tags are:
AU
|
Authors | |
TI
|
Document Title | |
SO
|
Publication Name (or Source) | |
DT
|
Document Type | |
DE
|
Authors' Keywords | |
ID
|
Keywords associated by SCOPUS or ISI database | |
AB
|
Abstract | |
C1
|
Author Address | |
RP
|
Reprint Address | |
TC
|
Times Cited | |
PY
|
Year | |
UT
|
Unique Article Identifier | |
DB
|
Database |
(ii) authorDocuments which is a list containing a bibliographic data frame for each author.
LIMITATIONS: Currently, the SCOPUS API does not allow downloading document references. As a consequence, it is not possible to perform co-citation analysis (the field CR is empty).
idByAuthor
for downloading author information and SCOPUS ID.
## Request a personal API key from the Elsevier web page https://dev.elsevier.com/sc_apis.html
## api_key="your api key"
## create a data frame with the list of authors to get information and IDs
# i.e. df[1,1:3] <- c("aria","massimo","naples")
#      df[2,1:3] <- c("cuccurullo","corrado", "naples")
## run idByAuthor function
#
# authorsID <- idByAuthor(df, api_key)
#
## extract the IDs
#
# id <- authorsID[,3]
#
## create the bibliographic collection
#
# res <- retrievalByAuthorID(id, api_key)
#
# M <- res$M  # the entire bibliographic data frame
# authorDocuments <- res$authorDocuments  # the list containing a bibliographic data frame for each author
rpys
computes a Reference Publication Year Spectroscopy for detecting
the Historical Roots of Research Fields.
The method was introduced by Marx et al., 2014.
rpys(M, sep = ";", timespan = NULL, graph = T)
M |
is a data frame obtained by the converting function
|
sep |
is the cited-references separator character. This character separates cited references in the CR column of the data frame. The default is sep = ";". |
timespan |
is a numeric vector c(min year,max year). The default value is NULL (the entire timespan is considered). |
graph |
is a logical. If TRUE the function plots the spectroscopy; otherwise the plot is created but not drawn. |
Reference:
Marx, W., Bornmann, L., Barth, A., & Leydesdorff, L. (2014). Detecting the historical roots of research fields by reference publication year spectroscopy (RPYS). Journal of the Association for Information Science and Technology, 65(4), 751-764.
a list containing the spectroscopy (class ggplot2) and three dataframes with the number of citations per year, the list of the cited references for each year, and the reference list with citations recorded year by year, respectively.
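The core of the method can be sketched in base R as follows: count cited references per publication year and compare each year with the median of the five-year window centred on it. The reference strings, the year-matching regular expression and the column name dev_median5 are assumptions made for this example, and the window is simply truncated at the edges; this is an illustration of the idea rather than the code used by rpys.

```r
# Toy CR column: cited references separated by ";" (hypothetical strings)
CR <- c(
  "GARFIELD E, 1955, SCIENCE; PRICE DJD, 1965, SCIENCE; SMALL H, 1973, J AM SOC INFORM SCI",
  "GARFIELD E, 1955, SCIENCE; SMALL H, 1973, J AM SOC INFORM SCI",
  "KESSLER MM, 1963, AM DOC; PRICE DJD, 1965, SCIENCE"
)

# Split into single references and pull out the first four-digit year
refs <- trimws(unlist(strsplit(CR, ";", fixed = TRUE)))
years <- as.integer(sub(".*?\\b(1[0-9]{3}|20[0-9]{2})\\b.*", "\\1", refs, perl = TRUE))

# Citations per reference publication year, including years with no citations
span <- seq(min(years), max(years))
counts <- as.integer(table(factor(years, levels = span)))

# Deviation of each year from the median of the 5-year window centred on it
med5 <- sapply(seq_along(counts), function(i) {
  window <- counts[max(1, i - 2):min(length(counts), i + 2)]
  median(window)
})
spectroscopy <- data.frame(Year = span, Citations = counts, dev_median5 = counts - med5)

print(spectroscopy[spectroscopy$Citations > 0, ])
```

Years with a large positive deviation are the peaks that point to frequently cited early works.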
convert2df
to import and convert an ISI or SCOPUS
Export file in a data frame.
biblioAnalysis
to perform a bibliometric analysis.
biblioNetwork
to compute a bibliographic network.
data(scientometrics, package = "bibliometrixData")
res <- rpys(scientometrics, sep = ";", graph = TRUE)
It calculates yearly published documents of the top sources.
sourceGrowth(M, top = 5, cdf = TRUE)
M |
is a data frame obtained by the converting function |
top |
is a numeric. It indicates the number of top sources to analyze. The default value is 5. |
cdf |
is a logical. If TRUE, the function calculates the cumulative occurrences distribution. |
an object of class data.frame
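As an informal picture of what such a table looks like, the base-R sketch below counts documents per year for the most frequent sources in an invented collection and accumulates them over time. The toy data and the variable names are assumptions of the example, not the function's internals.

```r
# Toy collection: source and publication year of each document (hypothetical)
M_toy <- data.frame(
  SO = c("SCIENTOMETRICS", "SCIENTOMETRICS", "JASIST",
         "SCIENTOMETRICS", "JASIST", "J INFORMETR"),
  PY = c(2015, 2016, 2016, 2017, 2017, 2017)
)

top <- 2
# Most frequent sources in the collection
top_sources <- names(sort(table(M_toy$SO), decreasing = TRUE))[seq_len(top)]

# Documents per year for the top sources (years without documents count as 0)
years <- sort(unique(M_toy$PY))
sel <- M_toy[M_toy$SO %in% top_sources, ]
yearly <- table(factor(sel$PY, levels = years),
                factor(sel$SO, levels = top_sources))

# Cumulative occurrences over time (analogue of cdf = TRUE)
cumulative <- apply(yearly, 2, cumsum)
growth <- data.frame(Year = years, cumulative, check.names = FALSE, row.names = NULL)
print(growth)
```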
data(scientometrics, package = "bibliometrixData")
topSO <- sourceGrowth(scientometrics, top = 1, cdf = TRUE)
topSO

# Plotting results
## Not run:
install.packages("reshape2")
library(reshape2)
library(ggplot2)
DF <- melt(topSO, id = "Year")
ggplot(DF, aes(Year, value, group = variable, color = variable)) + geom_line()
## End(Not run)
networkPlot
Create a network plot with separated communities.
splitCommunities(graph, n = NULL)
graph |
is a network plot obtained by the function |
n |
is an integer. It indicates the number of vertices to plot for each community. |
The function splitCommunities splits communities into separate subnetworks from a bibliographic network plot previously created by networkPlot.
It is a network object of the class igraph
biblioNetwork
to compute a bibliographic network.
networkPlot
to plot a bibliographic network.
net2VOSviewer
to export and plot the network with VOSviewer software.
cocMatrix
to compute a co-occurrence matrix.
biblioAnalysis
to perform a bibliometric analysis.
# EXAMPLE Keyword co-occurrence network
data(management, package = "bibliometrixData")
NetMatrix <- biblioNetwork(management, analysis = "co-occurrences", network = "keywords", sep = ";")
net <- networkPlot(NetMatrix, n = 30, type = "auto", Title = "Co-occurrence Network", labelsize = 1, verbose = FALSE)
graph <- splitCommunities(net$graph, n = 30)
A character vector containing a complete list of English stopwords
Data are used by biblioAnalysis
function
to extract Country Field of Cited References and Authors.
A character vector with 665 rows.
summary method for class 'bibliometrix'
## S3 method for class 'bibliometrix'
summary(object, ...)
object |
is the object for which a summary is desired. |
... |
can accept two arguments: |
The function summary computes and returns a list of summary statistics of the object of class bibliometrix.
the list contains the following objects:
MainInformation |
Main Information about Data | |
AnnualProduction |
Annual Scientific Production | |
AnnualGrowthRate |
Annual Percentage Growth Rate | |
MostProdAuthors |
Most Productive Authors | |
MostCitedPapers |
Top manuscripts per number of citations | |
MostProdCountries |
Corresponding Author's Countries | |
TCperCountries |
Total Citations per Country | |
MostRelSources |
Most Relevant Sources | |
MostRelKeywords |
Most Relevant Keywords |
biblioAnalysis
function for bibliometric analysis
plot
to draw some useful plots of the results.
data(scientometrics, package = "bibliometrixData")
results <- biblioAnalysis(scientometrics)
summary(results)
summary method for class 'bibliometrix_netstat'
## S3 method for class 'bibliometrix_netstat'
summary(object, ...)
object |
is the object for which a summary is desired. |
... |
can accept two arguments: |
The function summary computes and displays several statistics at both the network and the vertex level.
# to run the example, please remove # from the beginning of the following lines
# data(scientometrics, package = "bibliometrixData")
# NetMatrix <- biblioNetwork(scientometrics, analysis = "collaboration",
#   network = "authors", sep = ";")
# netstat <- networkStat(NetMatrix, stat = "all", type = "degree")
# summary(netstat)
It tabulates elements from a Tag Field column of a bibliographic data frame.
tableTag( M, Tag = "CR", sep = ";", ngrams = 1, remove.terms = NULL, synonyms = NULL )
M |
is a data frame obtained by the converting function |
Tag |
is a character object. It indicates one of the field tags of the standard ISI WoS Field Tag codify. |
sep |
is the field separator character. This character separates strings in each column of the data frame. The default is sep = ";". |
ngrams |
is an integer between 1 and 3. It indicates the type of n-gram to extract from titles or abstracts. |
remove.terms |
is a character vector. It contains a list of additional terms to delete from the documents before term extraction. The default is remove.terms = NULL. |
synonyms |
is a character vector. Each element contains a list of synonyms, separated by ";", that will be merged into a single term (the first word contained in the vector element). The default is synonyms = NULL. |
tableTag
is an internal routine of main function biblioAnalysis
.
an object of class table
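The basic operation, splitting a separator-delimited field and counting its elements, can be sketched in base R as below. The keyword strings are invented and the code only illustrates the idea; it does not cover n-grams, removed terms or synonyms.

```r
# Toy keyword column with ";"-separated entries (hypothetical values)
DE <- c("BIBLIOMETRICS; CITATION ANALYSIS",
        "CITATION ANALYSIS; H-INDEX",
        "BIBLIOMETRICS; CITATION ANALYSIS; SCIENCE MAPPING")

# Split each record on the separator, trim blanks, and drop empty strings
items <- trimws(unlist(strsplit(DE, ";", fixed = TRUE)))
items <- items[nzchar(items)]

# Frequency table, most frequent first
Tab <- sort(table(items), decreasing = TRUE)
print(Tab[1:2])
```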
data(scientometrics, package = "bibliometrixData")
Tab <- tableTag(scientometrics, Tag = "CR", sep = ";")
Tab[1:10]
It extracts terms from a text field (abstract, title, author's keywords, etc.) of a bibliographic data frame.
termExtraction( M, Field = "TI", ngrams = 1, stemming = FALSE, language = "english", remove.numbers = TRUE, remove.terms = NULL, keep.terms = NULL, synonyms = NULL, verbose = TRUE )
M |
is a data frame obtained by the converting function |
Field |
is a character object. It indicates the field tag of the textual data. The default is Field = "TI". |
ngrams |
is an integer between 1 and 3. It indicates the type of n-gram to extract from texts.
An n-gram is a contiguous sequence of n terms. The function can extract n-grams composed of 1, 2, 3 or 4 terms. Default value is ngrams = 1. |
stemming |
is logical. If TRUE the Porter Stemming algorithm is applied to all extracted terms. The default is stemming = FALSE. |
language |
is a character. It is the language of textual contents ("english", "german", "italian", "french", "spanish"). The default is language = "english". |
remove.numbers |
is logical. If TRUE all numbers are deleted from the documents before term extraction. The default is remove.numbers = TRUE. |
remove.terms |
is a character vector. It contains a list of additional terms to delete from the corpus after term extraction. The default is remove.terms = NULL. |
keep.terms |
is a character vector. It contains a list of compound words (formed by two or more terms) to keep in their original form in the term extraction process. The default is keep.terms = NULL. |
synonyms |
is a character vector. Each element contains a list of synonyms, separated by ";", that will be merged into a single term (the first word contained in the vector element). The default is synonyms = NULL. |
verbose |
is logical. If TRUE the function prints the most frequent terms extracted from documents. The default is verbose = TRUE. |
the bibliometric data frame with a new column containing terms about the field tag indicated in the argument Field
.
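To show what n-gram extraction means in practice, here is a minimal base-R sketch that cleans a title, removes a couple of stopwords and builds unigrams and bigrams. The helper ngrams_of, the example title and the two-word stopword list are all hypothetical; stemming, compound terms and synonyms are left out, so this is not a substitute for termExtraction.

```r
# Build n-grams from a character vector of tokens (hypothetical helper)
ngrams_of <- function(tokens, n) {
  if (length(tokens) < n) return(character(0))
  starts <- seq_len(length(tokens) - n + 1)
  vapply(starts, function(i) paste(tokens[i:(i + n - 1)], collapse = " "), character(1))
}

title <- "Bibliometric analysis of co-citation networks in 2017"
stop_terms <- c("of", "in")   # tiny illustrative stopword list

# Lower-case, drop numbers, split on blanks, remove stopwords
clean <- gsub("[0-9]+", " ", tolower(title))
tokens <- strsplit(trimws(clean), "\\s+")[[1]]
tokens <- tokens[!tokens %in% stop_terms]

unigrams <- ngrams_of(tokens, 1)
bigrams  <- ngrams_of(tokens, 2)
print(bigrams)
```

Note that removing stopwords before building n-grams joins terms that were not adjacent in the original text ("analysis co-citation"); whether that is desirable depends on the analysis.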
convert2df
to import and convert a WoS or SCOPUS export file into a bibliographic data frame.
biblioAnalysis
function for bibliometric analysis
# Example 1: Term extraction from titles
data(scientometrics, package = "bibliometrixData")

# vector of compound words
keep.terms <- c("co-citation analysis", "bibliographic coupling")

# term extraction
scientometrics <- termExtraction(scientometrics, Field = "TI", ngrams = 1,
  remove.numbers = TRUE, remove.terms = NULL, keep.terms = keep.terms, verbose = TRUE)

# terms extracted from the first 10 titles
scientometrics$TI_TM[1:10]

# Example 2: Term extraction from abstracts
data(scientometrics)

# term extraction
scientometrics <- termExtraction(scientometrics, Field = "AB", ngrams = 2,
  stemming = TRUE, language = "english",
  remove.numbers = TRUE, remove.terms = NULL, keep.terms = NULL, verbose = TRUE)

# terms extracted from the first abstract
scientometrics$AB_TM[1]

# Example 3: Term extraction from keywords with synonyms
data(scientometrics)

# vector of synonyms
synonyms <- c("citation; citation analysis", "h-index; index; impact factor")

# term extraction
scientometrics <- termExtraction(scientometrics, Field = "ID", ngrams = 1,
  synonyms = synonyms, verbose = TRUE)
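The synonyms argument packs each group into one string, with ";" as separator and the first entry as the term that replaces the others. The base R sketch below only illustrates that format. The helper parse_synonyms is hypothetical and is not part of bibliometrix, and the package may parse the strings differently internally.

```r
# Hypothetical helper (not a bibliometrix function): split each synonyms
# element on ";" and treat the first entry as the target term.
parse_synonyms <- function(synonyms) {
  parts <- strsplit(synonyms, ";", fixed = TRUE)
  lapply(parts, function(p) {
    p <- trimws(p)                       # drop spaces around each entry
    list(target = p[1], variants = p[-1])
  })
}

synonyms <- c("citation; citation analysis", "h-index; index; impact factor")
parsed <- parse_synonyms(synonyms)

parsed[[2]]$target    # term that is kept
parsed[[2]]$variants  # terms that would be merged into the target
```

Under this reading, "index" and "impact factor" would both be counted as "h-index".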
It performs a Thematic Evolution Analysis based on co-word network analysis and clustering. The methodology is inspired by the proposal of Cobo et al. (2011).
thematicEvolution(
  M,
  field = "ID",
  years,
  n = 250,
  minFreq = 2,
  size = 0.5,
  ngrams = 1,
  stemming = FALSE,
  n.labels = 1,
  repel = TRUE,
  remove.terms = NULL,
  synonyms = NULL,
  cluster = "walktrap"
)
M |
is a bibliographic data frame obtained by the converting function convert2df. |
field |
is a character object. It indicates the content field to use. Field can be one of c("ID","DE","TI","AB"). Default value is field = "ID". |
years |
is a numeric vector of one or more unique cut points. |
n |
is numerical. It indicates the number of words to use in the network analysis. |
minFreq |
is numerical. It indicates the minimum frequency of the words included in a cluster. |
size |
is numerical. It indicates the size of the cluster circles and is a number in the range (0.01,1). |
ngrams |
is an integer between 1 and 4. It indicates the type of n-gram to extract from texts.
An n-gram is a contiguous sequence of n terms. The function can extract n-grams composed of 1, 2, 3 or 4 terms. Default value is ngrams = 1. |
stemming |
is logical. If it is TRUE the words (from titles or abstracts) will be stemmed (using Porter's algorithm). |
n.labels |
is integer. It indicates how many labels to associate with each cluster. Default is n.labels = 1. |
repel |
is logical. If it is TRUE ggplot uses geom_label_repel instead of geom_label. |
remove.terms |
is a character vector. It contains a list of additional terms to delete from the documents before term extraction. The default is remove.terms = NULL. |
synonyms |
is a character vector. Each element contains a list of synonyms, separated by ";", that will be merged into a single term (the first word contained in the vector element). The default is synonyms = NULL. |
cluster |
is a character. It indicates the clustering algorithm to use, chosen among ("optimal", "louvain", "leiden", "infomap", "edge_betweenness", "walktrap", "spinglass", "leading_eigen", "fast_greedy"). |
thematicEvolution starts from two or more thematic maps created by the thematicMap function.
Reference:
Cobo, M. J., Lopez-Herrera, A. G., Herrera-Viedma, E., & Herrera, F. (2011). An approach for detecting, quantifying, and visualizing the evolution of a research field: A practical application to the fuzzy sets theory field. Journal of Informetrics, 5(1), 146-166.
a list containing:
nets |
The thematic nexus graph for each comparison. |
incMatrix |
Some useful statistics about the thematic nexus. |
thematicMap function to create a thematic map based on co-word network analysis and clustering.
cocMatrix to compute a bibliographic bipartite network.
networkPlot to plot a bibliographic network.
## Not run:
data(management, package = "bibliometrixData")
years <- c(2004, 2015)
nexus <- thematicEvolution(management, field = "ID", years = years, n = 100, minFreq = 2)
## End(Not run)
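To make the role of the years cut points concrete, the base R sketch below assigns a few made-up publication years to periods. It assumes that a document published in a cut-point year belongs to the earlier period; the boundary handling inside thematicEvolution may differ, and the vector py is purely illustrative.

```r
# Made-up publication years, for illustration only
py <- c(1998, 2004, 2005, 2010, 2015, 2016, 2020)

# Two cut points give three periods
years <- c(2004, 2015)

# left.open = TRUE makes each interval (lower, upper], so a cut-point
# year is assigned to the earlier period (an assumption of this sketch).
period_id <- findInterval(py, years, left.open = TRUE) + 1L

data.frame(py = py, period = period_id)
table(period_id)
```

With two cut points the documents fall into three periods, so a nexus is computed for each pair of consecutive periods.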
It creates a thematic map based on co-word network analysis and clustering. The methodology is inspired by the proposal of Cobo et al. (2011).
thematicMap(
  M,
  field = "ID",
  n = 250,
  minfreq = 5,
  ngrams = 1,
  stemming = FALSE,
  size = 0.5,
  n.labels = 1,
  community.repulsion = 0.1,
  repel = TRUE,
  remove.terms = NULL,
  synonyms = NULL,
  cluster = "walktrap",
  subgraphs = FALSE
)
M |
is a bibliographic data frame. |
field |
is the textual attribute used to build the thematic map, given as a field tag such as "ID", "DE", "TI" or "AB". The default is field = "ID". |
n |
is an integer. It indicates the number of terms to include in the analysis. |
minfreq |
is an integer. It indicates the minimum frequency (per thousand) of a cluster. It is a number in the range (0,1000). |
ngrams |
is an integer between 1 and 4. It indicates the type of n-gram to extract from texts.
An n-gram is a contiguous sequence of n terms. The function can extract n-grams composed of 1, 2, 3 or 4 terms. Default value is ngrams = 1. |
stemming |
is logical. If it is TRUE the words (from titles or abstracts) will be stemmed (using Porter's algorithm). |
size |
is numerical. It indicates the size of the cluster circles and is a number in the range (0.01,1). |
n.labels |
is integer. It indicates how many labels to associate with each cluster. Default is n.labels = 1. |
community.repulsion |
is a real. It indicates the repulsion force among network communities. It is a real number between 0 and 1. Default is community.repulsion = 0.1. |
repel |
is logical. If it is TRUE ggplot uses geom_label_repel instead of geom_label. |
remove.terms |
is a character vector. It contains a list of additional terms to delete from the documents before term extraction. The default is remove.terms = NULL. |
synonyms |
is a character vector. Each element contains a list of synonyms, separated by ";", that will be merged into a single term (the first word contained in the vector element). The default is synonyms = NULL. |
cluster |
is a character. It indicates the clustering algorithm to use, chosen among ("optimal", "louvain", "leiden", "infomap", "edge_betweenness", "walktrap", "spinglass", "leading_eigen", "fast_greedy"). |
subgraphs |
is a logical. If TRUE cluster subgraphs are returned. |
thematicMap starts from a co-occurrence keyword network to plot in a two-dimensional map the typological themes of a domain.
Reference:
Cobo, M. J., Lopez-Herrera, A. G., Herrera-Viedma, E., & Herrera, F. (2011). An approach for detecting, quantifying, and visualizing the evolution of a research field: A practical application to the fuzzy sets theory field. Journal of Informetrics, 5(1), 146-166.
a list containing:
map |
The thematic map as a ggplot2 object. |
clusters |
Centrality and density values for each cluster. |
words |
A list of the words falling in each cluster. |
nclust |
The number of clusters. |
net |
A list containing the network output (as provided by the networkPlot function). |
biblioNetwork function to compute a bibliographic network.
cocMatrix to compute a bibliographic bipartite network.
networkPlot to plot a bibliographic network.
## Not run:
data(scientometrics, package = "bibliometrixData")
res <- thematicMap(scientometrics, field = "ID", n = 250, minfreq = 5, size = 0.5, repel = TRUE)
plot(res$map)
## End(Not run)
Visualize the main items of three fields (e.g. authors, keywords, journals), and how they are related through a Sankey diagram.
threeFieldsPlot(M, fields = c("DE", "AU", "SO"), n = c(20, 20, 20))
M |
is a bibliographic data frame obtained by the converting function convert2df. |
fields |
is a character vector. It indicates the fields to analyze using the standard WoS field tags.
Default is fields = c("DE", "AU", "SO"). |
n |
is an integer vector. It indicates how many items to plot for each of the three fields.
Default is n = c(20, 20, 20). |
a sankeyPlot
# data(scientometrics, package = "bibliometrixData")
# threeFieldsPlot(scientometrics, fields = c("DE", "AU", "CR"), n = c(20, 20, 20))
Divide a bibliographic data frame into time slices.
timeslice(M, breaks = NA, k = 5)
M |
is a bibliographic data frame obtained by the converting function convert2df. |
breaks |
is a numeric vector of two or more unique cut points. |
k |
is an integer value giving the number of intervals into which the data frame is to be cut. |
the value returned from split is a list containing the data frames for each sub-period.
convert2df to import and convert an ISI or SCOPUS export file into a bibliographic data frame.
biblioAnalysis function for bibliometric analysis.
summary to obtain a summary of the results.
plot to draw some useful plots of the results.
data(scientometrics, package = "bibliometrixData")
list_df <- timeslice(scientometrics, breaks = c(1995, 2005))
names(list_df)
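When breaks is not supplied, k gives the number of intervals. The base R sketch below shows the general idea of cutting a year column into k equal-width intervals and splitting the rows accordingly. It is an illustration only: the toy data frame M_toy is made up, PY is assumed to be the publication-year column, and timeslice may choose its interval boundaries differently.

```r
# Toy stand-in for a bibliographic data frame (values are made up);
# PY is assumed to hold the publication year.
M_toy <- data.frame(
  TI = paste("Paper", 1:6),
  PY = c(1990, 1994, 2000, 2004, 2010, 2014)
)

# Cut the observed year range into k equal-width intervals,
# then split the rows into one data frame per interval.
k <- 3
slices <- split(M_toy, cut(M_toy$PY, breaks = k))

length(slices)        # number of sub-period data frames
sapply(slices, nrow)  # documents in each sub-period
```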
Deletes leading and trailing white spaces from a character object.
trim(x)
x |
is a character object. |
trim is an internal routine of the bibliometrix package.
an object of class character
char <- c(" Alfred", "Mary", " John")
char
trim(char)
Deletes leading white spaces from a character object.
trim.leading(x)
x |
is a character object. |
trim.leading is an internal routine of the bibliometrix package.
an object of class character
char <- c(" Alfred", "Mary", " John")
char
trim.leading(char)
Deletes extra white spaces from a character object.
trimES(x)
x |
is a character object. |
trimES is an internal routine of the bibliometrix package.
an object of class character
char <- c("Alfred BJ", "Mary Beth", "John John")
char
trimES(char)
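For readers who want to see what the three whitespace helpers amount to, the sketch below gives plain base R regular-expression equivalents. The *_sketch function names are hypothetical and the package's own implementations may differ; in particular, "extra white spaces" is assumed here to mean runs of two or more whitespace characters collapsed to a single space.

```r
# Hypothetical base R equivalents (not the package's own code)
trim_sketch <- function(x) gsub("^\\s+|\\s+$", "", x)    # both ends
trim_leading_sketch <- function(x) sub("^\\s+", "", x)   # start only
trim_es_sketch <- function(x) gsub("\\s{2,}", " ", x)    # collapse runs

x <- c("  Alfred  BJ ", "Mary", " John   John")

trim_sketch(x)
trim_leading_sketch(x)
trim_es_sketch(x)
```

Note that collapsing runs does not remove a single leading or trailing space, so the last helper is usually combined with the first.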