Logos Bible Software was created in the days of packaged software sold on physical media, before consumers had heard of the Internet. (It was 1991!)
Logos got its first Internet-enabled features in 1995, and over the years Logos has grown to be more and more connected to the Internet. Still, Logos 5 will run without an Internet connection, and users can (and do) have completely offline use of it. But most users are connected to the Internet most/all of the time, and we're designing future features in Logos to take advantage of this.
When you use a product that's delivered via a web site, there are certain assumptions you can safely make about what's being stored on the web servers: everything.
Every click, page view, search, IP address, time of visit, and bit of information typed into the site is stored. At a minimum, the standard "web log" functionality of the web server (standard since the first days of the web) is recording most of this info for every page view, and, since the entire site/application is on a remote server, all the information you type / enter is stored there, too.
Much of this info is recorded many times, at many places. Your web provider probably records and stores this info for months/years (so law enforcement can request it if desired) and the site may feed Google Analytics or another tool a copy of the data in order to get convenient reports/analysis.
People are rarely surprised by this. But it seems people are sometimes surprised, and even upset, to find out that desktop applications are now recording and reporting similar information.
Most desktop applications are, or shortly will be, completely integrated with web services. Even if an application does no explicit data sharing with a web service, simply checking RSS feeds, looking for updates / news / etc. generates web server logs that can be analyzed.
And most applications are explicitly interacting with web services, in order to deliver cloud-connected features, support synchronization between desktop and mobile devices, backup user data, access databases too large to store locally, etc.
Logos Bible Software has been interacting with web services for years. Early on it was simply retrieving news feeds and update notices, but starting with Logos 4 the application became highly integrated with web services.
We no longer think of Logos Bible Software as a stand-alone desktop software package. We think of it as a connected family of desktop and mobile software applications and online web services that help people study the Bible, alone or in community with others.
As a concession to "the missionary with the solar-powered laptop and no Internet connection", and to people who still want an isolated stand-alone software package, you can run the software with Internet access turned off. (It's becoming more and more difficult to maintain this functionality, but we'll try to keep it as long as we can.)
But our plan is to increase our use of the Internet to provide better functionality and new features, and we believe this will deliver real value to our users.
Things we do "online" and why:
Logos collects stats on the use of the software. At various times we've collected all kinds of different stats; at the moment Logos 5 collects less information than earlier versions, but we expect to hook up more reporting in the future.
These stats have led to actual improvements in our business and software. For example:
We tracked what percentage of users were on what operating system. This helped us know when we could drop support for old versions of Windows or Mac OS X, affecting few users and allowing us to allocate resources to new work instead of old OS support.
We tracked what percentage of users running the software each day had upgraded to a new version. It's useful to know when 80% of daily users are running Logos 5 -- we can stop promoting the upgrade so heavily. :-) If we weren't tracking the version used each day we'd only know the percentage of Logos 4 purchasers who had purchased Logos 5, and that might include purchasers who no longer use the software, distorting the data.
We tracked search queries. This is such a massive amount of info that the last time we decided to do some serious analysis on search queries we limited it to a single month. We sorted queries by frequency and looked to see how many used boolean operators, could not be parsed by the query engine, etc. We even just browsed them. (The document was a list of queries with counts -- no user identities.) From looking at a large aggregation of search queries we learned that boolean operators aren't used much, and were more likely to mess up a query than be used correctly. This led to the use of all-caps AND and OR as operators, reducing the chance that users would unintentionally include an "and" or "or" that messed up a query that was a phrase. We also saw people were searching for the names of holidays, like "Mothers' Day", which fed into our decision to develop the Preaching Themes database, which is used to tag resources -- and includes Mothers' Day and other holidays as themes.
We tracked which dialog boxes were used. This led to our decision to avoid dialog boxes in Logos 4.
We tracked which books were opened. This led to removing some books from collections, or keeping books in collections that we might otherwise have removed. It also helped us understand how important "smart" defaults were, in light of how strong an association there is between a book being the first reported in a tool and the one most opened.
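For the technically curious, here is a minimal sketch of the kind of search query analysis described above: tally identical queries with no user identity attached, then measure how many use a boolean operator. The function names are made up for illustration, and treating only all-caps AND and OR as operators is my reading of the rule described above, not our actual query engine.

```python
from collections import Counter

# Assumption for illustration: only all-caps AND / OR count as operators,
# so a lowercase "and" inside a phrase is treated as an ordinary word.
OPERATORS = {"AND", "OR"}


def tally_queries(queries):
    """Count identical queries. The input is query text only, no user identities."""
    return Counter(q.strip() for q in queries if q.strip())


def uses_operator(query):
    """Return True if the query contains an all-caps AND or OR as its own word."""
    return any(token in OPERATORS for token in query.split())


def operator_share(counts):
    """Fraction of all submitted queries (weighted by count) that use an operator."""
    total = sum(counts.values())
    if total == 0:
        return 0.0
    with_operators = sum(n for query, n in counts.items() if uses_operator(query))
    return with_operators / total
```

Sorting the tally by frequency (counts.most_common()) gives the "list of queries with counts" mentioned above, and operator_share gives a rough answer to "how many people actually use boolean operators?"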
These stats, when aggregated, offer value to Logos and help us make a better product. Many of them also feed back into features that benefit users:
We can offer "Sort by Recent" in mobile apps because the software stored what you opened when. We can offer the "auto-bookmarks" in the scroll bar of a resource, for quick jumping to a previously visited location. We can open a book on your mobile device to the place you were reading on another device because we sync your last read location. Soon we can indicate when you've read a book completely, eliminating the need to manually add a "read" tag in the library, as some users now do.
Moving forward, we plan to offer "crowd sourced" data that benefits all our users. (You will be able to turn off, or ignore, this crowd-sourced data if you don't want to use it.)
We modeled our star rating for resources on other widely used systems, like Netflix and Amazon.com and hundreds of other sites: you can apply your own star rating to any resource, which overrides any other rating. But if you don't rate something, by default you see the "community" rating. (And you can see both by hovering over the stars.)
Community tags supplement your own tags, and are intended to harness the "community" wisdom about a particular resource, helping you find things more easily and better understand your library.
(Both of these features were fully designed for Logos 4, but didn't make the development cutoff. When we finally shipped them in Logos 5 -- using the specs written for Logos 4 -- many users had already adopted their own meaning/conventions for tags and star ratings, and found the community data a distraction. We will be implementing a way to turn them off if you don't want to see this community info.)
These community features presently treat the entire user base as one community, but the intention has always been to introduce a "users like you" component to the algorithms, much like the way Netflix tries to tell you what their algorithm thinks YOU would rate the movie, not what everyone rated it.
Our hope is that we get enough data -- using voluntarily provided info like "denomination", and sales data like "what books you specifically purchased" -- that we can give you a star rating from "users like you", and weight the community tags in the same way. So a commentary set rated 5 stars and tagged "reliable" by users of one denomination, say, would be reported that way to others of that denomination who bought similar resources, but might be shown as "3 stars" and tagged "conservative" to users of another denomination who had purchased different resources. (The rating would probably differ, but you'd probably see all the tags -- they'd just be different sizes for different users.)
(This kind of recommendation system requires a lot of data in order to work, but with over 1 million users of our platform, we believe we can collect enough data to make it work in the future. And this is something we want to do in response to actual user requests: new customers often ask "what books should I buy?" or "can you recommend a commentary I can trust?" Or, "can you label the commentaries as conservative/liberal, or this-label/that-label?" We can't really do that in a way that's right for everyone, but we might be able to let "everyone" tell us enough that we can tell you what "people like you" think about this or that book. I'm sure this doesn't appeal to our 'power users', but I know it's highly requested by many new users. They want your opinion, power user!)
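To make the "users like you" idea concrete, here is a rough sketch of one way it could work: weight each person's star rating by how similar they are to you. Everything here is hypothetical. The similarity measure (overlap in purchased resources plus a bonus for a shared, voluntarily provided denomination), the 50/50 weighting, and the dictionary layout are assumptions for illustration, not a description of any algorithm we have built.

```python
def similarity(a, b):
    """Hypothetical similarity in [0, 1] between two user profiles.

    Half comes from the Jaccard overlap of purchased resources, half from
    sharing the same (voluntarily provided) denomination.
    """
    owned_a, owned_b = set(a["purchases"]), set(b["purchases"])
    union = owned_a | owned_b
    overlap = len(owned_a & owned_b) / len(union) if union else 0.0
    denom_a, denom_b = a.get("denomination"), b.get("denomination")
    same_denomination = 1.0 if denom_a and denom_a == denom_b else 0.0
    return 0.5 * overlap + 0.5 * same_denomination


def rating_for(user, resource, others):
    """Similarity-weighted mean of other users' star ratings for a resource.

    Returns None when nobody similar has rated the resource.
    """
    weighted_sum = 0.0
    total_weight = 0.0
    for other in others:
        stars = other["ratings"].get(resource)
        if stars is None:
            continue
        weight = similarity(user, other)
        weighted_sum += weight * stars
        total_weight += weight
    if total_weight == 0:
        return None
    return weighted_sum / total_weight
```

With weights like these, two users looking at the same commentary can see different ratings, because the people whose opinions count most are different for each of them. The same weights could scale community tags up or down.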
Popular highlights is another long-planned feature that aggregates many users' data (in this case, extracting the highlighted range, but not the text of notes or even the label of the highlighting style) to report which ranges of books were highlighted by many people. (The 'many' is dynamic -- in some resources it's 5+ users -- the minimum -- and in others there are so many highlights that a range isn't considered popular until 20, 50 or more users have independently highlighted it.)
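Here is a small sketch of how a dynamic "many" could be computed. The minimum of 5 users comes from the description above; the scaling rule (one required highlighter per 50 users who have highlighted anything in the resource) is purely an assumption for illustration, as are the function names. The sketch also only groups identical ranges, where a real implementation would need to merge overlapping ones.

```python
from collections import defaultdict

MIN_USERS = 5  # the minimum mentioned above


def popularity_threshold(total_highlighters, users_per_required=50):
    """Hypothetical rule: the bar rises with the number of people highlighting
    in the resource, but never drops below MIN_USERS."""
    return max(MIN_USERS, total_highlighters // users_per_required)


def popular_ranges(highlights, users_per_required=50):
    """Find ranges highlighted independently by enough distinct users.

    highlights: iterable of (user_id, (start, end)) pairs for one resource.
    Only the range is used; note text and highlighter style never enter.
    Returns {(start, end): distinct_user_count} for ranges meeting the bar.
    """
    users_by_range = defaultdict(set)
    all_users = set()
    for user_id, span in highlights:
        users_by_range[span].add(user_id)
        all_users.add(user_id)
    threshold = popularity_threshold(len(all_users), users_per_required)
    return {
        span: len(users)
        for span, users in users_by_range.items()
        if len(users) >= threshold
    }
```

Under this assumed rule, a resource with 1,000 highlighting users would need 20 independent highlights of a range before it shows as popular, while a lightly used resource would need only the minimum of 5.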
Aggregated demographic data will be extracted and likely shared with some publishers and authors. I'm not sure how useful this actually is -- will knowing that a book is popular with people who use the Greek NT, or even with people who have identified with a particular denomination, be useful to an author or publisher? Will someone go run an ad in the denomination magazine as a result? I don't know, but I do know that authors and publishers love this kind of info. "Lutheran women read my book on Wednesdays on Android phones, but they all give up after chapter 6. What does it mean?!" :-)
We hope to extract other useful stats from the intersection of feature use reporting and user data. I can imagine doing an analysis to see what words in the Greek NT are most often right-clicked and looked up, or have a Bible Word Study run on them. (And/or which words were the headwords for user-edited Bible Word Study Guides.) From this we might be able to get a list of "words of significant interest", in which we could invest more editorial resources and/or new features. The "Interesting Words" section of the Passage Guide, presently built by statistical analysis of the text, could be informed by statistical analysis of user interaction, too.
In the same way (I'm making stuff up now) we might want to run an analysis of which verses in the Bible have the most user-written note text attached. This might tell us the passages we should be giving the most attention to in future updates of the Faithlife Study Bible, or the Evangelical Exegetical Commentary.
Admittedly, these features would require "looking at synced user data" -- but I hope you can see how "the looking" is done by algorithms and doesn't represent a privacy invasion. In fact, this type of analysis is only useful when it's on "too much data." We need the forest, not a tree, to see the patterns that help us design features and content.
Other ways we'll be introducing "community":
We're lighting up collaborative documents at http://documents.logos.com. This will eventually be enabled for almost every document type.
The "personal" use case is being able to publish documents to (read-only), or collaborate on documents with (shared editing and ownership), any group you'd like. A pastor / professor / teacher could publish notes on a book of the Bible. Students could collaborate on a note document on a textbook. A scholar could collaborate on a highlighting project with a research assistant.
We hope to enable some forms of "community" data editing -- and even remote editorial work for compensation. For example, Logos 5 has some data sets that were created by tagging the biblical text -- we even used our own highlighting tool for some of the work. With collaborative documents, users could choose to join a tagging project on a text that Logos might not otherwise get to. Imagine if referent analysis, speaker labels, word senses, and clause searching were available for the Apostolic Fathers, Josephus, Philo, and all of the Perseus Project. These collaborative/social documents could help us distribute the workload over many contributors, track who made what contribution, and even pay for contributions in Logos credit.
This could allow students to "work for books" (a request we get surprisingly often) and help us offer richer data sets that we might otherwise not soon afford or have time to create.
We plan to make it easier to recommend resources and even to share quotations from resources. You can already tweet or share quotes from books, but in future releases you'll have the option to share a quotation from the book publicly, and resources will have online pages where you can see the publicly shared quotations before buying the resource.
See https://faithlife.com/markbarnes/resources as an example; it is a summary of Mark Barnes' reviews. The disabled tab for "Recommendations" is where I will be able to see all the books Mark has recommended (either publicly, or to a specific group that he and I are co-members of -- so he could recommend a particular book just to his church, or a class). On the "Quotes" tab I would see any quotes from the book that Mark had intentionally shared -- and, if I own the book, I'll be able to jump directly to that location in the book.
(Mark, I hope you don't mind me using you as an example -- you've written a lot of reviews. Thanks!)
We take user privacy very seriously; we offer a number of settings, you have the option to run completely offline, and we follow best practices like not storing your password at all. (That's why our CS reps can't tell you or email you your password, only reset it -- we literally don't have access to it.)
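If you're wondering how a service can check your password without storing it, here is a sketch of one common approach: keep only a random salt and a slow, one-way hash, and compare hashes at login. This is a general illustration using Python's standard library, not a description of our actual implementation; the function names and the iteration count are assumptions.

```python
import hashlib
import hmac
import os

ITERATIONS = 200_000  # illustrative work factor, not a recommendation


def hash_password(password, salt=None):
    """Derive a one-way digest from the password. Only (salt, digest) is kept."""
    if salt is None:
        salt = os.urandom(16)  # fresh random salt per user
    digest = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, ITERATIONS
    )
    return salt, digest


def verify_password(password, salt, expected_digest):
    """Re-derive the digest from the login attempt and compare in constant time."""
    _, digest = hash_password(password, salt)
    return hmac.compare_digest(digest, expected_digest)
```

Because the digest can't be run backwards, nobody with access to the stored values can read a password out of them, which is why the only support option is a reset.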
At the same time, though, we are committed to being a web-based, data-driven platform. We are no longer designing a stand-alone, isolated desktop application. Some planned features will require access to databases too large to deliver to user devices; you'll need web access to use them. We will be listening to our users, responding to their feedback and concerns, but like other web-based platforms, we will not necessarily be offering control over every individual setting. Some things come along with being web-based.
For example, you can choose to keep all your digital photos on your own machine disconnected from the Internet, or you can choose to upload them to Flickr. And at Flickr you may even have some settings about what info is shared with what users, or what permissions your photos are shown with. But Flickr will analyze all the uploaded photos to build a report of what cameras / phones are being used: http://www.flickr.com/cameras/ You can't say "yes, I want my photos stored on your server, but no, don't count them in your stats."
We are very careful with, and respectful of, individual privacy, and we'll be offering some controls/options, but we aren't, for example, going to support "sync my data but don't count me towards the number of Mac OS X users."
The coarse-grained control is turning off "Use Internet" in the Program Settings. The more fine-grained controls are still being decided on, and will reflect your input.
I hope this overview is helpful, and that you can appreciate the value that these social / community features add to the Logos platform, and hopefully to your study and investment as well.