diff --git a/Project.toml b/Project.toml index 75de30f..c19b4a0 100644 --- a/Project.toml +++ b/Project.toml @@ -3,7 +3,7 @@ name = "ReplicateBE" uuid = "671d9d50-c343-11e9-1a9c-fdd992682823" keywords = ["bioequivalence", "mixedmodel"] desc = "Mixed model solution for replicate designed bioequivalence study." -version = "0.2.0" +version = "1.0.0" [deps] LinearAlgebra = "37e2e46d-f89d-539d-b4ee-838fcccc9c8e" diff --git a/README.md b/README.md index 4117843..3ce1d3c 100644 --- a/README.md +++ b/README.md @@ -1,4 +1,4 @@ -**Public validated Beta version!** This program comes with absolutely no warranty. No liability is accepted for any loss and risk to public health resulting from use of this software. +This program comes with absolutely no warranty. No liability is accepted for any loss and risk to public health resulting from use of this software.

@@ -87,25 +87,7 @@ All API docs see [here](https://pharmcat.github.io/ReplicateBE.jl/latest/api/). # Random Dataset -Random dataset function is made for generation validation datasets and simulation data. - -``` -randrbeds(;n=24, sequence=[1,1], - design = ["T" "R" "T" "R"; "R" "T" "R" "T"], - inter=[0.5, 0.4, 0.9], intra=[0.1, 0.2], - intercept = 0, seqcoef = [0.0, 0.0], periodcoef = [0.0, 0.0, 0.0, 0.0], formcoef = [0.0, 0.0], seed = 0) -``` -Generate DataFrame with random multivariate data. Where: - - - n - Subject number; - - sequence - sequence subject distribution, [1,1] is equal 1:1, [1,2] - 1:2, [2,3,5] - 2:3:5 ets.; - - design - design matrix (sXp, where s - number of sequences, p - number of period), cells contains formulation label; - - inter - Inter-subject variation vector for G matrix: [σ₁, σ₂, ρ], where σ₁, σ₂ - formulation inter-subject variance, ρ - covariance coefficient; - - intra - Intra-subject variation vector for R matrix:[σ₁, σ₂], where σ₁, σ₂ - formulation intra-subject variance; - - intercept - model intercept value; - - seqcoef - model sequence coefficient values (additive): length = s (number of sequences); - - periodcoef - model period coefficient values (additive): length = p (number of periods); - - formcoef - model formulation coefficient values (additive): length = 2; +Random dataset function is made for generation validation datasets and simulation data. Description [here](https://pharmcat.github.io/ReplicateBE.jl/latest/syntax/). ## Structures diff --git a/chagelog.md b/chagelog.md index aac8edf..0a1b34c 100644 --- a/chagelog.md +++ b/chagelog.md @@ -1,4 +1,4 @@ -- v1.0.0 +- v1.0.0 (unstable) * validation * test procedures diff --git a/docs/src/details.md b/docs/src/details.md index 19eb589..641c956 100644 --- a/docs/src/details.md +++ b/docs/src/details.md @@ -114,7 +114,11 @@ Where L is a vector of known constant, C – variance-covariance matrix of fixed ### Validation -ReplicateBE was validated with 6 reference public datasets, 24 generated datasets and simulation study. ReplicateBE version 0.1.4 and 0.2.0 is compliant to SAS/SPSS, values checked: REML estimate, variance components estimate, fixed effect estimate, standard error of fixed effect estimate. Validation procedures included in package test procedure and perform each time when new version released or can be done at any time on user machine. Confidence interval (95%) for type I error (alpha) is 0.048047 - 0.050733 (10000 iterations). No statistically significant difference found with acceptable rate (0.05) found (version 0.1.4). Testing procedures cover approximately [![codecov](https://codecov.io/gh/PharmCat/ReplicateBE.jl/branch/master/graph/badge.svg)](https://codecov.io/gh/PharmCat/ReplicateBE.jl) of code, and perform for each release on Travis CI platform: [![Build Status](https://api.travis-ci.com/PharmCat/ReplicateBE.jl.svg?branch=master)](https://travis-ci.com/PharmCat/ReplicateBE.jl). +ReplicateBE was validated with 6 reference public datasets, 25 generated datasets and simulation study. ReplicateBE is compliant to SAS/SPSS, values checked: REML estimate, variance components estimate, fixed effect estimate, standard error of fixed effect estimate. Validation procedures included in package test procedure and perform each time when new version released or can be done at any time on user machine. Validation results (REML & 90% CI) table can be found [here](https://github.com/PharmCat/ReplicateBE.jl/tree/master/validation). + +ReplicateBE include simulation utility, that based on generation multivariate distributed datasets. This can be used not only in purpose of the package diagnostics, but also in purpose of sample size estimation ets. Simulation using version 0.1.4 was performed with 100000 iterations. Confidence interval (95%) for type I error (alpha) is 0.048047 - 0.050733. No statistically significant difference found with acceptable rate (0.05) found. + +Testing procedures cover approximately [![codecov](https://codecov.io/gh/PharmCat/ReplicateBE.jl/branch/master/graph/badge.svg)](https://codecov.io/gh/PharmCat/ReplicateBE.jl) of code, and perform for each release on Travis CI platform: [![Build Status](https://api.travis-ci.com/PharmCat/ReplicateBE.jl.svg?branch=master)](https://travis-ci.com/PharmCat/ReplicateBE.jl). ### Installation and using @@ -170,7 +174,7 @@ ReplicateBE was developed to get mixed model solution to bioequivalence clinical ### Discussion -ReplicateBE not designed for modeling in a general purpose, but can be used in situation with similar structure. In part of datasets ReplicateBE showed better optimization result as SPSS. Also ReplicateBE based on direct inversing of variance-covarance matrix V, so computation of ``V^{-1}`` may be time expensive if size of matrix is big. This does not happen in bioequivalence study where size of ``V`` is no more 4 (4 periods). But in general this can be serious disadvantage. This situation can be avoided using sweep based transformations (Wolfinger et al., 1994). In ReplicateBE variance structure strictly denoted and can’t be changed, but it can be a target in package developing path. In ReplicateBE Satterthwaite degree of freedom (DF) not equal with SAS/SPSS DF estimate in part of datasets. +ReplicateBE not designed for modeling in a general purpose, but can be used in situation with similar structure. In part of datasets ReplicateBE showed better optimization result as SPSS. Also ReplicateBE based on direct inversing of variance-covarance matrix V, so computation of ``V^{-1}`` may be time expensive if size of matrix is big. This does not happen in bioequivalence study where size of ``V`` is no more 4 (4 periods). But in general this can be serious disadvantage. This situation can be avoided using sweep based transformations (Wolfinger et al., 1994). In ReplicateBE variance structure strictly denoted and can’t be changed, but it can be a target in package developing path. In ReplicateBE Satterthwaite degree of freedom (DF) estimate is equal with SAS/SPSS DF estimate for full-replicated basic bioequivalence balanced and unbalanced datasets (2x4x2, 2x3x2), and can be unequal in datasets with dropouts, results for half-replicated designs or 2x4x4 designs can be found in validation results table. For some designs less than 0.1% of dataset can't be analyzed in version v1.0.0 with default setting. In this cases advanced settings can be used to avoid this problems. This cases don't lead to wrong data or wrong results because model throw error and stop. ### Acknowledgments diff --git a/docs/src/syntax.md b/docs/src/syntax.md index 52cdbd7..2ed1e87 100644 --- a/docs/src/syntax.md +++ b/docs/src/syntax.md @@ -2,7 +2,7 @@ ### Simple syntax -Simple syntax can be used in general purpose. +Simple syntax can be used in general purpose. ```julia rbe!(df; dvar::Symbol, subject::Symbol, formulation::Symbol, period::Symbol, sequence::Symbol) @@ -10,7 +10,7 @@ rbe!(df; dvar::Symbol, subject::Symbol, formulation::Symbol, period::Symbol, seq ### Full syntax -With keywords you can define additional options. +With keywords you can define additional options. ```julia rbe!(df; dvar::Symbol, @@ -42,9 +42,30 @@ rbe!(df; dvar::Symbol, ### Not modifying syntax -If you use ```rbe!()``` function no data transformations done with ```df```such as ```categorical!()``` and ```sort!()```. +If you use ```rbe()``` function no data transformations done with ```df```such as ```categorical!()``` and ```sort!()```. ```julia -rbe!(df; dvar::Symbol, subject::Symbol, formulation::Symbol, period::Symbol, sequence::Symbol) +rbe(df; dvar::Symbol, subject::Symbol, formulation::Symbol, period::Symbol, sequence::Symbol) +``` + +### Random dataset + +Random dataset function is made for generation validation datasets and simulation data. + +``` +randrbeds(;n=24, sequence=[1,1], + design = ["T" "R" "T" "R"; "R" "T" "R" "T"], + inter=[0.5, 0.4, 0.9], intra=[0.1, 0.2], + intercept = 0, seqcoef = [0.0, 0.0], periodcoef = [0.0, 0.0, 0.0, 0.0], formcoef = [0.0, 0.0], seed = 0) ``` +Where: + - n - Subject number; + - sequence - sequence subject distribution, [1,1] is equal 1:1, [1,2] - 1:2, [2,3,5] - 2:3:5 ets.; + - design - design matrix (sXp, where s - number of sequences, p - number of period), cells contains formulation label; + - inter - Inter-subject variation vector for G matrix: [σ₁, σ₂, ρ], where σ₁, σ₂ - formulation inter-subject variance, ρ - covariance coefficient; + - intra - Intra-subject variation vector for R matrix:[σ₁, σ₂], where σ₁, σ₂ - formulation intra-subject variance; + - intercept - model intercept value; + - seqcoef - model sequence coefficient values (additive): length = s (number of sequences); + - periodcoef - model period coefficient values (additive): length = p (number of periods); + - formcoef - model formulation coefficient values (additive): length = 2; diff --git a/docs/src/testval.md b/docs/src/testval.md index a8d76fa..8d598cc 100644 --- a/docs/src/testval.md +++ b/docs/src/testval.md @@ -37,6 +37,10 @@ Generated datasets made with *randrbeds* function with fixed seed (could be repr * 23 TRTR/RTRT * 24 TRT/RTR +### 24 subject, №8 repeat + + * 25 TRR/RTT + ### Special cases * 101 SP1: TRTR/RTRT 1024 subjects, 2000 dropped observations @@ -122,7 +126,9 @@ formulation: T / formulation: R ### Simulation study - Following simulation was performed for version v0.1.4: + #### v0.1.4 + + Following simulation was performed for version v0.1.4 (100000 iterations): ``` using Distributions, ReplicateBE function simulation(num) @@ -165,3 +171,37 @@ simulation(100000) 0.04939 ``` Confidence interval (95%) for power: 0.048047 - 0.050733. No statistically significant difference found. + + #### v1.0.0 + + ```julia + task = ReplicateBE.RandRBEDS(;n=24, + sequence=[1,2], design = ["T" "R" "T" "R"; "R" "T" "R" "T"], + inter=[0.05, 0.04, 0.6], intra=[0.02, 0.04], + intercept = 1.0, seqcoef = [0.0, 0.0], periodcoef = [0.0, 0.0, 0.0, 0.0], + formcoef = [0.0, log(0.8)], seed = 0, dropobs = 0) + + @time result = ReplicateBE.simulation(task; num = 100000, seed = 730795390628834530, verbose = true) + ``` + + 1837.914923 seconds (3.99 G allocations: 1.092 TiB, 9.91% gc time) + Seed: 730795390628834530 + Number: 100000 + Result: 0.05078660225829358 + CI: Estimate: 0.05079 (0.04943 - 0.05215) + + ```julia + task = ReplicateBE.RandRBEDS(;n=24, + sequence=[1,2], design = ["T" "R" "T"; "R" "T" "R"], + inter=[0.05, 0.04, 0.4], intra=[0.02, 0.04], + intercept = 1.0, seqcoef = [0.0, 0.0], periodcoef = [0.0, 0.0, 0.0], + formcoef = [0.0, log(0.8)], seed = 0, dropobs = 0) + + @time result = ReplicateBE.simulation(task; num = 100000, seed = 730795390628834530, verbose = true) + ``` + + 1638.566348 seconds (3.77 G allocations: 907.478 GiB, 9.77% gc time) + Seed: 730795390628834530 + Number: 100000 + Result: 0.04997148203368122 + CI: Estimate: 0.04997 (0.04862 - 0.05132) diff --git a/examples/simulation.jl b/examples/simulation.jl index c41416e..424c7f8 100644 --- a/examples/simulation.jl +++ b/examples/simulation.jl @@ -1,9 +1,39 @@ using ReplicateBE task = ReplicateBE.RandRBEDS(;n=24, -sequence=[1,1], design = ["T" "R" "T" "R"; "R" "T" "R" "T"], -inter=[0.05, 0.04, 0.6], intra=[0.02, 0.02], +sequence=[1,2], design = ["T" "R" "T" "R"; "R" "T" "R" "T"], +inter=[0.05, 0.04, 0.6], intra=[0.02, 0.04], +intercept = 1.0, seqcoef = [0.0, 0.0], periodcoef = [0.0, 0.0, 0.0, 0.0], +formcoef = [0.0, log(0.8)], seed = 0, dropobs = 0) + + +@time result = ReplicateBE.simulation(task; num = 100000, seed = 730795390628834530, verbose = true) +#= +1837.914923 seconds (3.99 G allocations: 1.092 TiB, 9.91% gc time) +Seed: 730795390628834530 +Number: 100000 +Result: 0.05078660225829358 +=# + +task = ReplicateBE.RandRBEDS(;n=24, +sequence=[1,2], design = ["T" "R" "T"; "R" "T" "R"], +inter=[0.05, 0.04, 0.4], intra=[0.02, 0.04], +intercept = 1.0, seqcoef = [0.0, 0.0], periodcoef = [0.0, 0.0, 0.0], +formcoef = [0.0, log(0.8)], seed = 0, dropobs = 0) +@time result = ReplicateBE.simulation(task; num = 100000, seed = 730795390628834530, verbose = true) + +#= +1638.566348 seconds (3.77 G allocations: 907.478 GiB, 9.77% gc time) +Seed: 730795390628834530 +Number: 100000 +Result: 0.04997148203368122 +#E63 +=# + +task = ReplicateBE.RandRBEDS(;n=24, +sequence=[1,2], design = ["T" "R" "T" "R"; "R" "T" "R" "T"], +inter=[0.05, 0.04, 0.3], intra=[0.02, 0.04], intercept = 1.0, seqcoef = [0.0, 0.0], periodcoef = [0.0, 0.0, 0.0, 0.0], formcoef = [0.0, log(0.8)], seed = 0, dropobs = 10) -result = ReplicateBE.simulation(task; num = 200, seed = 123) +result = ReplicateBE.simulation(task; num = 10, seed = 6564683774737, verbose = true) diff --git a/src/randrbeds.jl b/src/randrbeds.jl index 68b98a9..9c3f1e8 100644 --- a/src/randrbeds.jl +++ b/src/randrbeds.jl @@ -247,7 +247,7 @@ function simulation(task::RandRBEDS; io = stdout, verbose = false, num = 100, l printstyled(io, "Iteration: $i, seed $(seeds[i]): $(err): ERROR! \n"; color = :red) end end - return RBEDSSimResult(seed, num, seeds, cnt/num) + return RBEDSSimResult(seed, num, seeds, cnt/(num - err)) end function Base.show(io::IO, obj::RBEDSSimResult) diff --git a/src/rbe.jl b/src/rbe.jl index 44a96b3..d41a50f 100644 --- a/src/rbe.jl +++ b/src/rbe.jl @@ -70,7 +70,9 @@ rbe(df; dvar::Symbol, sequence::Symbol, g_tol::Float64 = 1e-8, x_tol::Float64 = 0.0, f_tol::Float64 = 0.0, iterations::Int = 100, store_trace = false, extended_trace = false, show_trace = false, - memopt = true) + memopt = true, + init = [], + postopt = false, vlm = 0.8, maxopttry = 50, rhoadjstep = 0.15) ``` Mixed model fitting function for replicate bioequivalence without data preparation (apply categorical! for each factor and sort! to dataframe). @@ -86,6 +88,26 @@ with covariance matrix for each subject: V_{i} = Z_{i}GZ_i'+R_{i} ``` +Keywords: + +* dvar - variable +* subject - subject factor +* formulation +* period +* sequence +* g_tol = 1e-8 +* x_tol = 0.0 +* f_tol = 0.0 +* iterations = 100 - maximum iteration for optimization +* store_trace = false +* extended_trace = false +* show_trace = false +* memopt = true - memory optimization (function cache) +* init = [] - initial variance paremeters +* postopt = false - post optimization +* vlm = 0.8 - "link function" coefficient +* maxopttry = 50 - maximum attempts to optimize +* rhoadjstep = 0.15 - adjustment value for rho after optimization fail """ function rbe(df; dvar::Symbol, subject::Symbol, diff --git a/test/test.jl b/test/test.jl index 567aa33..8db865b 100644 --- a/test/test.jl +++ b/test/test.jl @@ -413,6 +413,15 @@ end @test ci[end][1] ≈ 0.854985 atol=1E-5 @test ci[end][2] ≈ 1.293459 atol=1E-5 + #25 + #TRR/RTT + rds = ReplicateBE.randrbeds(;n=24, sequence=[1,1], design = ["T" "R" "R"; "R" "T" "T"], inter=[0.5, 0.4, 0.5], intra=[0.1, 0.2], intercept = 1.0, seqcoef = [0.0, 0.0], periodcoef = [0.0, 0.0, 0.0], formcoef = [0.0, 0.0], seed = 10008) + be = ReplicateBE.rbe!(rds, dvar = :var, subject = :subject, formulation = :formulation, period = :period, sequence = :sequence) + ci = confint(be, 0.1, expci = true) + @test ReplicateBE.reml2(be) ≈ 140.37671499958694 atol=1E-5 + @test ci[end][1] ≈ 0.8208303872086634 atol=1E-5 + @test ci[end][2] ≈ 1.3228068920153602 atol=1E-5 + end @testset " # Simulation " begin