Commit 3f17dacc authored by peter

Initial AlfBI docs

parent 09db4a51
# Change Log
This is the change log for Parashift's [AlfBI Module](../paramodules/
All notable changes to this project will be documented in this file.
This project adheres to [Semantic Versioning](
## [3.0.0] 2017-09-11
### Changed
* Renamed Project from `Paran` to `AlfBI`
## [2.3.0] 2017-09-01
### Added
* Ability to modify the version label of a document via bulk import. The Parashift versioning module must be installed for this feature to work.
## [2.2.0] 2017-08-18
### Added
* Workflow, System and Audit Repo Endpoints
* Workflow, System and Audit Camel Components
* Default camel route for AlfBI for use with deploying all components
* Metabase Dashboards for use with importing via the `mbexport` cli tool
### Changed
* Points to the new Parashift Maven Repository
* Metabase DAO sync is off by default. This is not needed now that
Metabase supports LDAP
## [2.1.0] 2017-03-01
### Added
* Ability to update tags
* Ability to move the node's path to a different folder. It will
automatically create the folder if none exists
### Changed
* Adjusted the error messaging and transaction handling so that you will see more types of errors. Integrity violation errors only happen on transaction commit, which previously was too late in the process to show to the browser.
### Fixed
* Fixed some situations where transaction locks could persist due to being marked for rollback but not being rolled back.
## [2.0.0] 2016-10-27
### Changed
* Migrate to Metabase for reporting
## [1.1.1] 2016-05-23
### Changed
* Error messages via Bulk Upload are presented to the end user
## [1.1.0] 2016-03-10
### Added
* Include Banana Dashboard store in Alfresco
## [1.0.0] 2016-02-12
* Initial Commit
\ No newline at end of file
# Bulk Upload
AlfBI enables the bulk upload of metadata via a CSV file, letting users make mass changes to nodes within Alfresco.
This includes:
* Property values such as filename and description
* `(From 2.1.0 onwards)` Tags
* `(From 2.1.0 onwards)` Node Paths
* `(From 2.3.0 onwards)` Node Versions (With the [Version Module](./
Rather than creating a CSV file manually, you can use Metabase to generate it for you.
## CSV Requirements
The CSV file must meet the following requirements:
* The first row must contain the titles of the columns that are in use.
* Column titles should use underscores instead of colons (e.g. `cm_name` instead of `cm:name`).
* A `sys_node-uuid` (or `uuid`) column must be present.
The simplest way to generate a CSV file is to use Metabase.
### Mandatory Fields
The only mandatory field is the UUID of the node. This column should be titled either `sys_node-uuid` or `uuid`.
If there is no column for UUID then the bulk upload will not take place, as it won't be able to determine what node to act upon.
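The header rules above can be checked before uploading. The following is a minimal sketch, not part of AlfBI itself: the helper names `to_column_title` and `find_uuid_column` are hypothetical, and the check only mirrors the requirements described in this section.

```python
import csv
import io
from typing import Optional

# Column titles accepted for the node UUID, as described above.
UUID_COLUMNS = ("sys_node-uuid", "uuid")


def to_column_title(qname: str) -> str:
    """Convert a prefixed property name such as cm:name to cm_name."""
    return qname.replace(":", "_", 1)


def find_uuid_column(csv_text: str) -> Optional[str]:
    """Return the UUID column title found in the first row, or None."""
    header = next(csv.reader(io.StringIO(csv_text)), [])
    titles = [title.strip() for title in header]
    for candidate in UUID_COLUMNS:
        if candidate in titles:
            return candidate
    return None
```

If `find_uuid_column` returns `None`, the file has no usable UUID column and the upload would not take place.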
### Property Fields
Bulk Upload will only change values that are present in the CSV file, so if you want to change a property, you need to include it as a column when exporting from Metabase. For instance, if you want to change the name of a document, include a column called `cm_name`.
Bulk Upload will not update values that aren't changed, so you can re-run the same CSV several times if you want to make minor adjustments.
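As an illustration, a CSV that only renames documents needs just two columns. This is a rough sketch using Python's standard `csv` module; the function name `build_rename_csv` and the sample UUID are made up for the example.

```python
import csv
import io
from typing import Dict


def build_rename_csv(renames: Dict[str, str]) -> str:
    """Build CSV text with one row per node: its UUID and its new cm_name."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    # First row: column titles, using underscores instead of colons.
    writer.writerow(["sys_node-uuid", "cm_name"])
    for node_uuid, new_name in renames.items():
        writer.writerow([node_uuid, new_name])
    return buffer.getvalue()


if __name__ == "__main__":
    # Hypothetical UUID, for illustration only.
    print(build_rename_csv({"1234-abcd": "report.pdf"}))
```

Because only `cm_name` appears as a column, no other property of the listed nodes would be touched.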
### Special Fields
As of version 2.1.0 there are two special fields: path and tags. The path field allows you to move nodes to another location, and the tags field allows you to update the tags of the node.
As of version 2.3.0, with the [Version Module]( installed you can adjust the version label of the node to a new value.
#### Tags
Tags are changed using the special `tags` field. Enter the names of the tags you want, separated by a `|` (pipe) character.
##### Example Tags
For example, if you want a node to have two tags, `first` and `second`, enter this value: `first|second`
##### Clearing Tags
To clear a node of all tags, provide an empty value in the `tags` field.
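When the CSV is generated by a script, the `tags` value can be built from a list. The sketch below assumes nothing beyond the pipe-separated format described above; `format_tags` and `parse_tags` are hypothetical helper names.

```python
from typing import Iterable, List


def format_tags(tags: Iterable[str]) -> str:
    """Join tag names with a pipe; an empty list gives the empty value that clears tags."""
    cleaned = [tag.strip() for tag in tags if tag.strip()]
    for tag in cleaned:
        if "|" in tag:
            # A pipe inside a tag name would be read as a separator.
            raise ValueError(f"tag contains the separator character: {tag!r}")
    return "|".join(cleaned)


def parse_tags(value: str) -> List[str]:
    """Split a tags field back into individual tag names."""
    return [tag for tag in value.split("|") if tag]
```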
#### Node Path
You can bulk move nodes by using the `path` field. Paths always start with `/Company Home`, and individual folders are separated by the `/` character.
Bulk Upload will auto-create folders if it can, or reuse existing ones if they are present.
##### Path Example
As an example, a node in an Example Site's document library would have a path like the following:

    /Company Home/Sites/example-site/documentLibrary

Changing this value will move the node, creating folders on the fly.
So if we wanted to move nodes to a folder called `Example Folder` within `Example Site` the path would be:

    /Company Home/Sites/example-site/documentLibrary/Example Folder

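The path layout shown in the example can be assembled programmatically. This is a small sketch that only follows the pattern of the example paths above; `site_path` is a hypothetical helper, not an AlfBI function.

```python
def site_path(site_id: str, *folders: str) -> str:
    """Build a path inside a site's document library, starting at /Company Home."""
    parts = ["Company Home", "Sites", site_id, "documentLibrary", *folders]
    for part in parts:
        if "/" in part or not part:
            # "/" separates folders, so it cannot appear inside a folder name here.
            raise ValueError(f"invalid path segment: {part!r}")
    return "/" + "/".join(parts)


if __name__ == "__main__":
    print(site_path("example-site", "Example Folder"))
```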
#### Version Labels
Versions can be adjusted on nodes if you have the version module installed. There are some restrictions around this though:
* Due to the way Alfresco handles versioning, you cannot set a lower version. For example, if your document version is `1.1` you can't set it to `1.0`
* As per the version module, you must use positive numbers, so a version label such as `1.a` won't work.
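These two restrictions can be checked before uploading. The sketch below is a pre-check under stated assumptions, not the version module's own logic: it assumes labels are dot-separated whole numbers, compares them component by component, and treats an unchanged label as acceptable. The names `parse_label` and `can_set_version` are hypothetical.

```python
import re
from typing import Tuple

# Assumption: a valid label is made of whole numbers separated by dots, e.g. 3 or 1.1.
_NUMERIC_LABEL = re.compile(r"\d+(\.\d+)*")


def parse_label(label: str) -> Tuple[int, ...]:
    """Turn a label such as '1.1' into (1, 1); reject labels such as '1.a'."""
    if not _NUMERIC_LABEL.fullmatch(label):
        raise ValueError(f"not a numeric version label: {label!r}")
    return tuple(int(part) for part in label.split("."))


def can_set_version(current: str, proposed: str) -> bool:
    """Return True if the proposed label is numeric and not lower than the current one."""
    try:
        new = parse_label(proposed)
    except ValueError:
        return False
    old = parse_label(current)
    # Pad with zeros so that '1' and '1.0' compare as equal.
    width = max(len(old), len(new))
    new += (0,) * (width - len(new))
    old += (0,) * (width - len(old))
    return new >= old
```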
##### Version Example
Say you want to update all documents to version 3. You can simply add the following value in the CSV file in the `cm_versionLabel` column:
## Usage
Usage is broken down into the following steps:
* Exporting via CSV: Using Metabase, you can export a copy of the metadata of each node. You can also filter the results so that only the nodes you want are included.
* Editing the CSV: You can edit CSV in any spreadsheet program such as Excel or LibreOffice.
* Importing via CSV: You can import via CSV by navigating to the admin tools.
### Exporting via CSV
You can use Metabase to export CSV files:
* Navigate to Metabase, either via the `Reports` button within the Alfresco Share header if you have a custom install, or via the URL provided
* Select `New Question` and then choose the `Documents` table underneath the Alfresco database
* Select `Get Answer`
* Use the ![download](../images/bulkupload-download.png) Download icon on the right to download the results, selecting `CSV` as the type
### Editing the CSV
You can open and edit the CSV just like a normal spreadsheet file using Excel or LibreOffice. Take care that the saved CSV file still meets the above CSV requirements when you have finished editing.
### Importing via CSV
Importing via CSV is done by a system administrator via the Share front end.
To import, follow these steps:
* Navigate to `Admin Tools`
* Select `Metadata Bulk Upload` from the tools section
* Choose your CSV file and click `Upload`
* Results of your changes and any errors will appear within a table
## AlfBI Installation Guide
The following guide covers the installation of AlfBI.
## Requirements
### Alfresco
* Alfresco Version 5.0+
* Alfresco Dynamic Extensions
* AlfStream Version 1.9.0+
* Java Version 8
### Apache Karaf
* Apache Karaf 4.0+
### PostgreSQL
* PostgreSQL version 9.5+
## Compilation
Compilation requires read access to the Parashift Private Maven Repository.
Run `gradle amp` in the repo, share and camel directories.
## Installation (Alfresco)
Install the following amps into Alfresco:
* Alfresco Dynamic Extensions
* AlfStream
* AlfBI (this module)
## Installation (Karaf)
Installation of this module requires an instance of Apache Karaf to host Apache Camel.
Karaf is normally stored within the `/opt/apache-karaf-*/` directory where `*` is the version number.
* Follow the instructions for installing Karaf from the Quickstart here: [Karaf Manual](
## Installation (PostgreSQL)
You will need a PostgreSQL database to store all the indexes. As an
example, you can create a db `alfbi` with user and password `alfbi`:

    postgres=# create user alfbi password 'alfbi';
    postgres=# create database alfbi owner alfbi;

By default, AlfBI will automatically create the necessary tables for you when it starts ingesting.
## Configuration (Karaf)
* Create or update the config file at `/<path_to_karaf_install>/etc/com.parashift.cfg` with the following properties:
    * `alfresco.url`: the URL of the local Alfresco instance (normally localhost, but could be a different server)
    * `alfresco.username`: the local Alfresco admin username
    * `alfresco.password`: the local Alfresco admin password
    * `alfbi.username`: the username for the alfbi PostgreSQL database
    * `alfbi.password`: the password for the alfbi PostgreSQL database
    * `alfbi.url`: the URL for the alfbi PostgreSQL database
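If the file is produced by a deployment script, the six properties can be rendered as `key=value` lines. The following is only a sketch: `render_karaf_cfg` is a hypothetical helper, and the angle-bracket values are placeholders, not AlfBI defaults.

```python
from typing import Dict

# The six property names listed above.
REQUIRED_KEYS = (
    "alfresco.url",
    "alfresco.username",
    "alfresco.password",
    "alfbi.username",
    "alfbi.password",
    "alfbi.url",
)


def render_karaf_cfg(settings: Dict[str, str]) -> str:
    """Render the required properties as key=value lines, failing if any is missing."""
    missing = [key for key in REQUIRED_KEYS if key not in settings]
    if missing:
        raise ValueError("missing properties: " + ", ".join(missing))
    return "".join(f"{key}={settings[key]}\n" for key in REQUIRED_KEYS)


if __name__ == "__main__":
    # Placeholder values only; substitute the values for your environment.
    placeholders = {key: "<" + key.replace(".", "-") + ">" for key in REQUIRED_KEYS}
    print(render_karaf_cfg(placeholders), end="")
```

The output would then be written to `etc/com.parashift.cfg` under the Karaf install directory.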
The default configuration uses the following:
## Configuration (Metabase)
After you have successfully pulled down data from Alfresco you should have the tables/databases necessary for creating dashboards.
The dashboard configuration is stored here: [metabase_dashboards.yml](
**Note**: If you are not using the default database username/password, make sure you update the YAML file with the correct values.
Using `mbexport` you can import the dashboards like so:

    mbexport -i metabase_dashboards.yml

## Configuration (Alfresco)
The following configuration options are available for Alfresco
### Metabase User/Group Synchronisation
**Note**: Requires that Metabase is running with PostgreSQL as its configuration store.
The following options control the user/group sync in Alfresco global properties (with defaults):
The db params specify the Metabase configuration database to connect to.
If the Metabase db cannot be contacted, or there is an exception, then
Metabase sync won't be enabled.
### Metabase Reports SSL Redirect
**Note**: Requires that Metabase is accessible via another domain.
To configure the reports redirect in Alfresco share, user/group
synchronisation must be active as per above.
The `metabase.front.url` configuration option specifies the metabase
You will need to set up a proxy for Metabase via Apache like so:

    ProxyPass /alfmb
    ProxyPassReverse /alfmb
    ProxyPass /
    ProxyPassReverse /

The method of redirect is as follows:
* Users click on the `Reports` button within Alfresco Share
* This redirects them to `{{metabase.front.url}}/alfmb?alf_ticket={{user_ticket}}`
* The proxymb webscript then performs the following:
    * Checks if the user is valid
    * Creates a session in Metabase
    * Sets the browser cookies to that session
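The redirect URL in the second step can be sketched as follows. This only illustrates the URL shape given above; `build_reports_redirect`, the example host and the example ticket are all hypothetical, and the session handling performed by the proxymb webscript is not shown.

```python
from urllib.parse import urlencode


def build_reports_redirect(front_url: str, user_ticket: str) -> str:
    """Build {{metabase.front.url}}/alfmb?alf_ticket={{user_ticket}}."""
    query = urlencode({"alf_ticket": user_ticket})
    return f"{front_url.rstrip('/')}/alfmb?{query}"


if __name__ == "__main__":
    # Hypothetical front URL and ticket, for illustration only.
    print(build_reports_redirect("https://reports.example.com", "TICKET_abc123"))
```

Using `urlencode` means a ticket containing reserved characters would be escaped rather than breaking the query string.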
# Parashift AlfBI
The Parashift AlfBI Module includes the
following features:
* Bulk Upload of Metadata via CSV
* Camel Components for synchronising the following indexes: Documents,
System and Audit Log
* Saves Banana Dashboards into the Alfresco Repository
* Metabase User/Group synchronisation with Alfresco
* A `Reports` header menu item which allows users to access reports via
Alfresco Share
## Installation
See the [Installation guide](./
## What is AlfBI?
AlfBI is a simple and powerful analytics tool which lets anyone learn
and make decisions from their company’s data. As a business intelligence
tool, it answers your questions about your Alfresco data, displaying the
answers in formats that make sense, whether as a bar graph, a pie chart
or a detailed table.
Typical questions include:
- How large is the repository, and how fast is it growing?
- How many files are there, and who is creating the most?
- Which site has the most activity, and by whom?
- How many login failures happened?
- Which files were the most viewed last month?
- Which files were added last week?
- Which files are locked, how long have they been locked for and by whom?
- Which geographical locations do image files refer to?
Questions can be saved for later, making it easy to come back to them,
and they can be grouped into dashboards.
AlfBI also makes it easy to share questions and dashboards with the rest
of your team.
AlfBI is built on Metabase, an open source tool, and interacts with
Alfresco, the world’s most popular Enterprise Content Management System.
The following websites provide further information on Metabase:
- <>
- <>
## Alfresco dashboards
AlfBI has been configured to connect to your Alfresco database.
A number of useful dashboards have been added, which you can access by
clicking on **Dashboards**.
### Alfresco Documents
The Documents dashboard answers common questions on the content managed
by your Alfresco system, including:
- The total size of the file system, in GB
- The total number of files in the system
- The total number of sites configured in Alfresco
- The breakdown of file size (in MB) per site, for the top sites
- The breakdown of file size (in MB) per document type (mime type),
for the top 5 types
- The number of files per site, for the top 5 sites
- The number of files per document type, limited to the top 5 types
- The growth trend of file size over time
The dashboard can be filtered by:
- Site: type in the Site ID, and select from the drop-down list. Some cards on the dashboard will be faded, as the filter does not apply to them.
- Creator: type in the username of the user, and select from the drop-down list. This will update all the cards and show the user’s file usage.
- Creation Date: select the retrospective time period to consider; it can be set to any number of days, weeks, months or years. This will update all the cards and show the document profile matching this time period.
- Document type: type in the mime type of the document. A full list is available from <>, but the most common are:
    - application/msword
    - application/onenote
    - application/pdf
    - application/
    - application/
    - application/
    - application/zip
    - audio/mp4
    - audio/mpeg
    - image/gif
    - image/jpeg
    - image/png
    - video/mp4
    - video/mpeg
    - video/quicktime
Of course, filters can be combined.
### Alfresco System
The System dashboard answers common technical questions on your Alfresco
system, including:
- The current CPU load, as a percentage value, computed as an average
over the last hour
- The current load on the memory allocated to the Java Virtual
Machine, which is an early indicator of performance degradation in
- The license information, including date, remaining time and number
of users
- The CPU load trend over the specified period
- The performance of the Java Virtual Machine memory over the
specified period
- The performance of the server’s memory over the specified period.
The only filter available for this dashboard is the date filter through
which the time scale can be varied.
### Alfresco Users
The Users dashboard answers common questions about the users of your
Alfresco system, including:
- The most active users, measured by activity (creates, reads, updates
and deletes), sorted by most active to least active
- The most active sites, measured by activity in those sites, and
sorted by most active to least active
- The busiest hours of the day, measured in activity, across the
- The timeline of activity in Alfresco, both in number of changes and
number of users on the system
- The breakdown of activity, per type of activity.
The dashboard comes with two separate filters:
- Creation Date: select the time period to consider. It can be set to:
    - Today, yesterday, the past 7 days, the past 30 days
    - Last week, month or year
    - This week, month or year
- User: type in the username of the user.
## Exporting Document Metadata
Document information can be exported to an external file (CSV, Excel or
JSON), so that the metadata can be edited and re-imported into Alfresco.
To download the metadata, create a new Question (see the section Asking
questions), using the **Alfresco Dataset** and the **Nodes** table.
Add the following filter to remove all deleted files:
Add any other filters, such as Site, Owner, Created date, as required.
Click Get Answer.
Finally, click on the down arrow at the top right-hand side of the page
and select the format required.
In the particular case of CSV and Excel, the file will be downloaded for
all matching files, and will include all the metadata fields in
Alfresco, displayed as columns in Excel.
Importing metadata into Alfresco is covered in [Bulk Upload](./
......@@ -40,6 +40,11 @@ pages:
- Module Installation: 'setup/'
- Paramp: 'setup/'
- Para Modules:
- AlfBI:
- Readme: 'paramodules/'
- Bulk Upload: 'paramodules/'
- Installation: 'paramodules/'
- Change log: 'changelogs/'
- AlfStream:
- Readme: 'paramodules/'
- AlfStream Sync: 'paramodules/'